Technique for efficiently classifying packets using a trie-indexed hierarchy forest that accommodates wildcards

ABSTRACT

A technique, specifically apparatus and accompanying methods, which utilizes a trie-indexed hierarchy forest ("rhizome") that accommodates wildcards for retrieving, given a specific input key, a pattern stored in the forest that is identical to or subsumes the key. The rhizome contains a binary search trie and a hierarchy forest. The search trie provides an indexed path to each unique, most specific, pattern stored in a lowest level of the hierarchy forest and also possibly to increasingly general patterns at higher levels in the pattern hierarchy. The hierarchy forest organizes the patterns into nodal hierarchies of strictly increasing generality. For use as a packet classifier, the rhizome stores wildcard-based packet classification patterns at separate corresponding pattern nodes, along with corresponding "reference" fields associated therewith. Operationally, as each different queue is established or removed, a corresponding classification pattern is either inserted into or removed from the rhizome. A search key is formed for each packet, typically by concatenating classification fields, e.g. source and destination addresses and source and destination port designations, appearing in a header of the packet. The search key is then applied to the rhizome to determine whether that key exists therein, by virtue of either matching an identical classification pattern or being completely subsumed within a more general pattern stored therein. When such a classification is found, the classifier returns the contents of the associated reference field, which, for scheduling, is a designation of a transmission queue to which the packet is to be directed.

BACKGROUND OF THE DISCLOSURE

1. Field of the Invention

The invention relates to a technique, specifically apparatus and accompanying methods, which utilizes a trie-indexed hierarchy forest that accommodates wildcards for, inter alia, retrieving, given a specific input key, a pattern stored in the forest that is identical to or subsumes the key. This technique finds particular, though not exclusive, use in a computer-based network, for efficiently classifying packets in order to apply any network related processing applicable to the packet, e.g., packet scheduling. Thus, this invention also relates to apparatus for a packet classifier and accompanying methods for use therein that embody this technique.

2. Description of the Prior Art

Packets are routinely used within a computer network to carry digital communication between multiple nodes on the network. Any such network possesses a finite number of different connections among the nodes therein. Generally speaking, whenever a computer stationed at one such node is executing an application wherein the results of that application will be sent to a computer at another network node, the former computer will establish a connection, through the network, to the latter computer and then send these results, in packetized form, over that connection.

A network interface for a personal computer contains both specialized hardware, as required to physically and electrically interface that computer to the network itself, and associated software which controls the interface and governs packetized communication therethrough. Inasmuch as the interface hardware is irrelevant to the present invention, we will not discuss it in any detail. The software is often part of the operating system, such as exists in, e.g., the Windows NT operating system currently available from the Microsoft Corporation of Redmond, Washington (which also owns the registered trademark "Windows NT"). In particular, the software implements, inter alia, various processes that rely on classifying each packet and processing the packet accordingly. One such process is packet scheduling. While several other such packet-related processes, such as routing, security and encryption, are also employed in the software, for purposes of brevity and illustration, we will confine the ensuing discussion to scheduling.

In particular, while a physical network interface to a computer operates at a single speed, e.g., 10 or 100 Mb/second, several different streams of packetized traffic, depending on their data content, can be interleaved together by the computer for simultaneous transmission at different rates through that interface. To accommodate this, a packet scheduler directs individual packets in each such stream to a corresponding software transmission queue (also referred to hereinafter as simply a "queue") from which packets in each such stream will be dispatched, in proper order, for network transmission at a corresponding rate--regardless of the network destinations of these streams. Given the substantially increased data rate required for, e.g., video data over textual data, the scheduler encountering both video and textual packetized data will ensure that each type of data packet is directed to the proper queue such that a significantly greater number of video-carrying packets than text-carrying packets will subsequently be transmitted per unit time through the interface. Other software implemented processes successively pull individual packets from the queues, at the corresponding data rates associated therewith, for multiplexing and subsequent transmission through the interface to the network. Inasmuch as these other processes are not relevant to the present invention, they will not be discussed in any further detail.

Packet classification, for purposes of packet scheduling, will be performed by constructing a so-called "key" from select fields contained in a packet in order to associate the packet with a corresponding queue. These fields are illustratively source and destination addresses (typically IP addresses) and corresponding port designations (all of which are collectively referred to herein as "classification fields"). This association typically occurs by consistently concatenating the classification fields, in the packet being classified, into a key which, in turn, is used to access a data structure in order to retrieve therefrom an identification of a corresponding queue for that packet. Since a group of packets that have differing values for their classification fields can nevertheless be transmitted at the same rate and hence should be directed to the same queue, a mask field containing one or more so-called "wildcard" (oftentimes referred to as "don't care") values is often used, through logical combination with the classification fields, to yield an identification associated with a single queue. Generally speaking, this identification is viewed as a "classification pattern", i.e., a bit field having a length equal to the total length of the concatenated classification fields wherein any bit in the pattern can have a value of "1", "0" or "X", where "X" is a wildcard. As a result, a single pattern having a wildcard(s) therein can serve to classify an entire group of such packets. If a match occurs between the non-wildcard bits of the pattern (i.e., taking the wildcard value(s) into account) and corresponding bits of the classification fields for the packet being classified, then an associated queue designation for that pattern is accessed from the data structure.
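By way of illustration only, the wildcard match described above can be sketched in Python as a naive linear scan over a small table of patterns. This is the straightforward approach whose cost grows with the number of stored patterns, not the technique of the invention. The four-bit key width, the table contents, the queue names and the function name classify_linear are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical pattern table: (value, mask, queue designation).
# A mask bit of 1 means "must match"; a mask bit of 0 marks a wildcard ("X").
# Entries are listed from most specific to most general.
PATTERNS = [
    (0b1010, 0b1111, "queue-A"),  # pattern 1010
    (0b1000, 0b1100, "queue-B"),  # pattern 10XX
    (0b0000, 0b1000, "queue-C"),  # pattern 0XXX
]


def classify_linear(key: int) -> Optional[str]:
    """Return the queue of the first pattern whose non-wildcard bits equal the key's."""
    for value, mask, queue in PATTERNS:
        # XOR exposes differing bits; the mask discards differences at wildcard positions.
        if ((key ^ value) & mask) == 0:
            return queue
    return None  # no stored pattern matches or subsumes the key


if __name__ == "__main__":
    for key in (0b1010, 0b1011, 0b0101, 0b1100):
        print(format(key, "04b"), classify_linear(key))
```

Because every stored pattern may have to be examined, the time taken by such a scan is linear in the size of the table.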

By virtue of permitting wildcards within packet classifications (i.e., patterns), the complexity associated with searching the data structure, for a pattern given the classification fields for a packet, as well as that of the structure itself, increases considerably. Furthermore, the process of classifying packets lies directly within the data flow and adds another layer of processing thereto, i.e., each outgoing packet must be classified before it is launched into the network. Consequently, any added complexity necessitated by accommodating wildcards will require additional processing. Since only a finite amount of processing time can be allocated to classify each packet and packet classification tends to be processor intensive, such classification needs to be rather efficient--particularly for use in a computer that is to experience significant packet traffic.

A principal way to increase classification efficiency is to utilize a process that retrieves stored classification information as fast as possible from a data structure.

While the art teaches various approaches for classifying packets, such as that typified in M. L. Bailey et al, "Pathfinder: A Pattern-Based Packet Classifier", Proceedings of First Symposium on Operating Systems Design and Implementation (OSDI), USENIX Assoc., 14-17 November 1994, pages 115-123 (hereinafter the "Bailey et al" paper), and W. Doeringer et al "Routing on Longest-matching Prefixes", IEEE/ACM Transactions on Networking, Vol. 4, No. 1, February 1996, pages 86-97 (hereinafter the "Doeringer et al" paper), these approaches, while efficient and effective in their target environments, are limited. In that regard, the techniques described therein exhibit retrieval times that are generally linearly related to the number of elements (n) in a classification database. A large network will have a substantial number of different patterns. Hence, the size of a classification data structure for such a network can be considerable which, in turn, will engender linearly increasing and hence relatively long retrieval times as the database expands. For packetized payload data that requires a high data rate, such retrieval times may be inordinately long and cause excessive delay to a recipient user or process.

Furthermore, packet classifiers fall into two distinct types: declarative and imperative. Basically, a declarative classifier stores a pattern as a filter while the filter, used in an imperative classifier, contains a small segment of executable code. In particular, a declarative classifier relies on embedding a description, such as a key, in each packet for which a classification is sought and then matching the key to a pattern in a stored classification and retrieving an associated value, such as a queue designation, therefrom. An imperative classifier, on the other hand, executes, possibly on an interpretive basis, the code segment in each and every stored classification, in seriatim, against the header of an incoming packet to compute a result. The result specifies a corresponding pattern for that packet. Where the number of patterns is rather small, an imperative classifier executes a relatively small number of code segments against each packet. Inasmuch as each code segment is implemented through a highly simplified instruction set, the classification delay for small networks, with relatively few patterns, tends to be tolerable. However, both processing complexity and associated delay become intolerable for packet classification in a large network that has an extensive number of different patterns. Declarative classifiers eliminate the need to process each incoming packet through a separate code segment for each and every classification, and hence provide significantly shorter response times. However, prior art declarative classifiers, such as that typified by the methodology described in the Bailey et al paper, exhibit classification delay that is linear in the number of stored patterns and hence can be intolerably long in a large network which is expected to carry packetized payload data, such as video, that requires a high data rate.

While the Doeringer et al paper describes a declarative classification methodology that can handle a pattern containing wildcards, this methodology requires that all wildcards be located in a contiguous group at the end of the pattern. Inasmuch as this methodology is directed to IP (Internet Protocol) routing where arbitrarily placed wildcards do not occur, this methodology is acceptable in that application. However, for packet classification, where classification can occur for various different purposes, such as scheduling, and a wildcard can arbitrarily occur anywhere in a pattern, the methodology taught by the Doeringer et al paper is simply unsuited for such broad-based classification.

Therefore, a need exists in the art for a fast and versatile packet classifier that incorporates a search technique capable of rapidly retrieving stored information, from a data structure, given a specific value in a group of classification fields in the packet. Furthermore, this technique should accommodate a wildcard(s) located in any arbitrary position within a stored pattern and, to reduce delay, should preferably exhibit retrieval times that increase less than a linear function of the number of stored patterns.

Moreover, and apart from use in packet classification, a broad need continues to exist in the art for a generalized technique that, given specific input data (in the form of, e.g., a key), can rapidly retrieve a stored pattern, containing a wildcard(s) at any location therein, from a data structure--regardless of what the input data and patterns specifically represent. Such a technique would likely find widespread use including, but not limited to, packet classification. While the opposite problem of how to rapidly find a specific datum stored in a database given a generalized input pattern has been extensively investigated in the art, the converse problem, inherent in, e.g., a packet classifier, of how to swiftly retrieve a generalized stored pattern from a data structure given specific input data, e.g., classification fields, has apparently received scant attention in the art, which thus far, as discussed above, has yielded rather limited and disappointing results.

SUMMARY OF THE INVENTION

Our present invention satisfies these needs and overcomes the deficiencies inherent in the art through a stored data structure predicated on our inventive rhizome-indexed hierarchy forest, specifically our inventive modified hierarchical Patricia tree. In that regard, by structuring stored pattern data through a binary search trie, i.e., a search structure, that indexes into pattern hierarchies, i.e., a pattern hierarchy structure, wherein the hierarchies accommodate wildcard(s), the structure can be used to retrieve a pattern stored in the forest, given a specific input key, that is identical to or subsumes the key. Advantageously, our inventive structure permits a wildcard to be stored at any arbitrary bit position in a pattern.

In accordance with our broad teachings, our inventive data structure (also, for simplicity, referred to herein as a "rhizome"), typically stored in computer memory and/or on a computer-readable storage medium, contains two interconnected data structures: a binary search trie (forming a "search structure" when stored in memory and/or on a readable storage medium) and a hierarchy forest (forming a "hierarchy structure" when stored in memory and/or on a readable storage medium). The search trie, implemented through, e.g., a generalization of a Patricia tree, provides an indexed path to each unique, most specific, pattern stored in a lowest level of the hierarchy forest. The hierarchy forest, which extends beyond the search trie, organizes the patterns into nodal hierarchies of strictly increasing generality, starting with those that are most specific. Each pattern is stored in a separate pattern node. Each hierarchy of pattern nodes, at its lowest level, contains a node with a specific lowest level pattern. That node is then followed in the hierarchy by nodes, to the extent they exist, with increasingly general patterns. Each of these latter patterns, owing to an included wildcard(s), completely subsumes, by virtue of strict hierarchy, the pattern in the node situated immediately therebelow. By virtue of accommodating hierarchically related wildcard-based patterns, our inventive rhizome permits multiple nodes, either branch or pattern nodes, to point to a common pattern node.

By virtue of our specific inventive teachings, given a specific input ("search") key, a stored pattern, that provides the most specific match, either identically or which completely subsumes the key, is retrieved from the stored database through a two-fold retrieval process. First, the search trie, having a topology of branch nodes, is traversed by branching, via child paths, at successive branch nodes therein as determined by corresponding bits in the key, each such bit being designated by a so-called "pivot bit" associated with each such branch node, until a lowest-level pattern node is reached. The pivot bit defines a bit position at which at least a pair of stored patterns in the rhizome disagree with each other. The branch nodes are organized in terms of increasing pivot bit value. Second, the pattern hierarchy, containing that lowest-level pattern node, is itself searched until the most specific pattern therein is found which matches or completely subsumes the key. A search ultimately succeeds if the retrieval process locates a stored pattern that most specifically matches, either identically or in terms of subsuming, the search key. If a search succeeds, then contents of a so-called "reference field" associated with the stored pattern are returned for subsequent use.
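The two-fold retrieval process just described might be sketched, in highly simplified form, as follows. This is a toy illustration over a hand-built structure, not the Retrieval routine of the invention: the class names PatternNode and BranchNode, the field names, the four-bit key width and the three stored patterns (1010, 10XX and 0XXX) are assumptions made for the example, and insertion and removal are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Union

WIDTH = 4  # assumed key width in bits for this toy example


@dataclass
class PatternNode:
    value: int                              # bit values of the pattern
    mask: int                               # 1 = must match, 0 = wildcard
    reference: str                          # contents of the "reference field"
    parent: Optional["PatternNode"] = None  # next, more general, pattern in the chain


@dataclass
class BranchNode:
    pivot: int                              # pivot bit position, MSB = position 0
    child0: Union["BranchNode", PatternNode]
    child1: Union["BranchNode", PatternNode]


def bit(key: int, pos: int) -> int:
    """Return the key bit at position pos, counting from the MSB at position 0."""
    return (key >> (WIDTH - 1 - pos)) & 1


def retrieve(root: Union[BranchNode, PatternNode], key: int) -> Optional[str]:
    # Step 1: descend the search trie, branching on the key bit named by each pivot.
    node = root
    while isinstance(node, BranchNode):
        node = node.child1 if bit(key, node.pivot) else node.child0
    # Step 2: climb the hierarchy chain from most specific to most general pattern.
    while node is not None:
        if ((key ^ node.value) & node.mask) == 0:
            return node.reference
        node = node.parent
    return None  # no stored pattern matches or subsumes the key


# Hand-built toy structure: 1010 sits below 10XX in one chain; 0XXX stands alone.
general = PatternNode(0b1000, 0b1100, "queue-B")                   # 10XX
specific = PatternNode(0b1010, 0b1111, "queue-A", parent=general)  # 1010
other = PatternNode(0b0000, 0b1000, "queue-C")                     # 0XXX
root = BranchNode(0, other, BranchNode(2, general, specific))

if __name__ == "__main__":
    for key in (0b1010, 0b1011, 0b1000, 0b0111, 0b1100):
        print(format(key, "04b"), retrieve(root, key))
```

In this sketch key 1011 reaches pattern node 1010 through the trie, fails to match it, and then succeeds against the more general pattern 10XX stored above it; key 1100 exhausts its chain without a match.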

Our inventive rhizome exhibits the advantageous characteristic that the key will match either a specific pattern stored in the rhizome or an increasingly general pattern stored above it in a hierarchical chain, or that key will not have a matching pattern stored anywhere in the rhizome. Hence, our inventive rhizome effectively narrows down a search for a pattern that most specifically matches a key from an entire database of stored patterns to just a linear chain of patterns in order of strict hierarchy, the latter being usually substantially smaller in size than the former. Specifically, through our inventive rhizome, the maximum number of bit comparisons that needs to be made to retrieve a pattern for a given search key generally equals the number of bits in the key itself plus the number of patterns, in the longest hierarchy chain in the hierarchy forest, less one. Moreover, owing to the strict hierarchy inherent in every hierarchical chain in our inventive rhizome, i.e., where patterns at higher levels in a given pattern hierarchy completely subsume those at lower levels therein, then, if a bit-by-bit comparison is to be made for patterns in a pattern hierarchical chain against an input key, then as increasingly general patterns are encountered in that hierarchy, there is no need to compare a bit in any bit position of the key that has previously been compared against and found to match that in any lower level pattern in that chain. This, in turn, further reduces processing time required to find a desired pattern in the rhizome.

Advantageously, our present invention finds particular, though not exclusive, use in a packet classifier for efficiently, i.e., very rapidly, classifying packets for scheduling. In such an application, our inventive rhizome stores packet classifications as individual classification patterns therein, wherein each classification (i.e., pattern) is typically formed of, e.g., source-destination addresses and source-destination port designations (all collectively referred to as "classification fields"). Depending upon a specific implementation, the classification field may also contain: other fields from a packet in addition to the addresses and port designations, fewer fields than these addresses and port designations, or even other fields from a packet. Wildcard-based classifications, i.e., patterns containing one or more wildcards, occur inasmuch as packets with different classification fields can be directed to a common transmission queue; hence, having a common classification. Each different pattern is stored at a separate corresponding pattern node in the inventive rhizome, along with, as the corresponding reference field therewith, a queue designation. Packets are dispatched from each queue for network transport at a particular data rate associated with that queue. In operation, as each different queue is established or removed, a corresponding classification pattern is either inserted into or removed from the classifier, specifically our inventive rhizome therein, respectively. Our inventive packet classifier forms a search key for each packet, typically by consistently concatenating the classification fields appearing in a header of the packet. The search key is then applied to our inventive rhizome which determines whether that key exists in the rhizome, by virtue of either matching an identical classification pattern, i.e., a match of all non-wildcard bits in the key with corresponding bits in the pattern, stored therein or being completely subsumed by a stored classification pattern. In either case, if such a classification pattern is found for the key, the search succeeds. Hence, the classifier then returns the contents of the associated reference field therefor to specify the queue to which the packet is then directed.

Our inventive rhizome provides an advantageous feature that, for large databases, asymptotically, average search delay through the rhizome is a logarithmic function of a number of patterns stored in the rhizome rather than linear in the number (n) of patterns, as occurs with conventional databases of patterns containing wildcards. Hence, use of our inventive rhizome, particularly with a large and increasing classification database--as frequently occurs in packet classifiers, appears to provide significant time savings over conventional packet classification techniques.

Another advantageous feature provided by our inventive rhizome is backward compatibility with a conventional Patricia tree. In particular, if no wildcard-based patterns exist within our inventive rhizome or are to be inserted therein, then the structure of this rhizome would be identical to that of a conventional Patricia tree containing these patterns thereby assuring full backward compatibility. Furthermore, our inventive retrieval, insertion and removal processes would each operate on this rhizome, in the absence of any wildcard-based patterns, with substantially, if not totally, the same degree of computational complexity as would the corresponding processes in a conventional Patricia tree.

BRIEF DESCRIPTION OF THE DRAWINGS

The teachings of the present invention can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

FIG. 1 depicts a high-level block diagram of two illustrative networked computers which each utilizes the teachings of the present invention;

FIGS. 2A and 2B depict typical key 200, and pattern 250 and its accompanying mask field 255, respectively, used in classifying a packet for scheduling;

FIG. 3 depicts a block-diagram of illustrative computer system 100, shown in FIG. 1, which utilizes the teachings of the present invention;

FIG. 4 depicts a simplified block diagram of network support software 112 that forms part of operating system (O/S) 110 shown in FIGS. 1 and 3;

FIG. 5A depicts a simplified high-level diagram of a packet scheduler that utilizes our inventive packet classifier;

FIG. 5B depicts a high-level flowchart of our inventive packet classification process 550 performed by packet classifier 410 shown in FIGS. 4 and 5A;

FIGS. 6A and 6B graphically depict a conventional database retrieval problem and an inverse problem inherent in a packet classifier, respectively;

FIGS. 7A and 7B depict single pattern 710, containing so-called wildcards, and resulting patterns 750 all subsumed therein, respectively;

FIGS. 8A, 8B and 8C respectively depict a group of wildcard-containing patterns, the same group hierarchically organized as a tree, and four-bit keys that are mapped into the patterns;

FIGS. 9A and 9B depict illustrative conventional Patricia trees 900 and 960, respectively, with exemplary data retrieval operations shown therefor;

FIG. 9C depicts data stored at each node, either branch or data node, of a conventional Patricia tree, such as that illustratively shown in either FIG. 9A or FIG. 9B;

FIG. 9D depicts data stored at each node of conventional Patricia tree 900 shown in FIG. 9A;

FIG. 9E depicts a conventional node insertion operation performed on sub-tree portion 902 shown in FIG. 9D;

FIG. 9F depicts a conventional node removal operation performed on sub-tree portion 902 shown in FIG. 9D;

FIG. 10 depicts illustrative rhizome 1000 which embodies the teachings of our present invention;

FIGS. 11A and 11B diagrammatically depict, through corresponding Venn diagrams, permitted hierarchical relationships between a pair of patterns stored in our inventive rhizome and specifically within hierarchy forest 1000₂ shown in FIG. 10;

FIG. 11C diagrammatically depicts, through a Venn diagram, disallowed relationships between a pair of patterns stored in our inventive structure;

FIG. 12A depicts sub-structure portion 1002, as shown in FIG. 10, prior to a node insertion operation;

FIG. 12B depicts sub-structure portion 1002' which results after a node has been inserted into sub-structure 1002 shown in FIG. 12A, along with corresponding contents of VALUE, MASK and IMASK fields of each of the nodes shown in sub-structure portion 1002' both before and after the insertion;

FIG. 12C depicts sub-structure portion 1002' prior to removal of node 1220 therefrom;

FIG. 12D depicts sub-structure portion 1004 of rhizome 1000 shown in FIG. 10;

FIG. 12E depicts sub-structure portion 1004 in a re-oriented position prior to insertion of a new pattern node therein;

FIG. 12F depicts sub-structure portion 1004' which results after insertion of new pattern node 1230 into sub-structure portion 1004 shown in FIG. 12E;

FIG. 13 graphically depicts data fields associated with each node in our inventive rhizome;

FIG. 14 depicts a flowchart of Retrieval routine 1400 which executes within our inventive pattern classifier 410 shown in FIG. 4;

FIG. 15 depicts a flowchart of Insertion routine 1500 which is executed by our inventive pattern classifier 410 shown in FIG. 4;

FIG. 16 depicts the correct alignment of the drawing sheets for FIGS. 16A and 16B;

FIGS. 16A and 16B collectively depict a flowchart of Node Insert routine 1600 that is executed by, e.g., Insertion routine 1500 shown in FIG. 15;

FIG. 17 depicts a flowchart of Removal routine 1700 which is also executed by our inventive pattern classifier 410 shown in FIG. 4;

FIG. 18 depicts a flowchart of Node Remove routine 1800 that is executed by, e.g., Removal routine 1700 shown in FIG. 17;

FIG. 19 depicts a flowchart of Replicate routine 1900 that is executed by, e.g., Node Insert routine 1600 shown in FIGS. 16A and 16B; and

FIG. 20 depicts Eliminate routine 2000 that is executed by, e.g., Node Remove routine 1800 shown in FIG. 18.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.

DETAILED DESCRIPTION

After considering the following description, those skilled in the art will clearly realize that the teachings of our present invention can be readily utilized in substantially any environment that processes packets to expedite packet classification. Our present invention will advantageously function not only within a computer system but also, generally speaking, within any apparatus that applies substantially any network-related processing to packets. Furthermore, though the broad teachings of our invention, particularly our inventive rhizome, are suitable for advantageous use within a wide variety of computer-based data classification processes to locate a stored generalized pattern given specific input data, to facilitate understanding and for brevity, we will describe our invention in the context of illustratively and predominantly a computer system which, as part of its operating system, classifies packets, for scheduling.

A. Packet Classification in a Network Environment

FIG. 1 depicts a high-level block diagram of networked environment 5 which illustratively contains computers 80 and 100 inter-connected through network 60. Inasmuch as the specific components that constitute network 60 are irrelevant to the present invention, these aspects will not be addressed hereinafter. Computers 100 and 80 are connected, via appropriate electrical connections 50 and 70, to network 60. These connections can be, e.g., wired, i.e., physical, or wireless links, the latter including radio and/or optical links; the actual form of the link being immaterial to the present invention.

Suffice it to say and to the extent relevant, computer system 80 located at a receiving location (henceforth a "receiver") can obtain data, such as, e.g., video or text data, from, e.g., application programs 130 located at a sending location (henceforth a "sender")--the actual mechanism through which the data is requested being irrelevant to the present invention. To provide network connections, computer system 100 also provides within its executing operating system 110 appropriate network support software 112. This network support software is often part of operating system 110, which is typified by, e.g., the Windows NT operating system currently available from the Microsoft Corporation of Redmond, Washington (which also owns the registered trademark "Windows NT"). For a clear understanding, the reader should also simultaneously refer, throughout the following discussion, to both FIGS. 1 and 4, the latter depicting a simplified block diagram (to the extent relevant) of network support software 112.

Network support software 112 provides, inter alia, packet classification and packet scheduling functionality, implemented through packet classifier 410 and packet scheduler 450, respectively. In operation and in essence, the network support software receives, as symbolized by path 120, a stream of packets from, e.g., application programs 130, and then, for purpose of scheduling, classifies, through packet classifier 410, each such packet as to its corresponding transmission queue (also referred to herein as simply a "queue"). The scheduler then directs each such packet to its corresponding queue. Thereafter, as symbolized by dashed line 140, the packets in each queue (not specifically shown in either FIGS. 1 or 4, but will be discussed in detail below in conjunction with FIG. 5A) are collectively interleaved together and transmitted onto network 60 at a corresponding data rate for each queue. Thus, through packet classification, scheduler 450 assures that the individual packets are to be launched into network 60 at a rate commensurate with a data rate of their payload data. Each queue may contain packets destined for different corresponding networked computers, e.g., computer 80 and/or any other computer(s) connected to network 60.

In essence, packet classification will be performed, for purposes of scheduling, by using values of specific classification fields contained in the header of a packet, and specifically concatenating these fields together, in a consistent manner, to form a so-called "key" which, in turn, is used to access a data structure to retrieve a designation of a proper transmission queue. These classification fields that form illustrative key 200 are graphically shown in FIG. 2A.

In particular, each packet contains source and destination IP (Internet Protocol) addresses, i.e., a network address of a computer or other network device ("source device") at which that packet was created and a network address of a computer or other network device ("destination device") at which that packet is to be delivered. The packet also contains designations of source and destination ports, i.e., specific ports at the source and destination devices at which the packet originated and at which the packet is to be received, respectively. The source and destination addresses, and source and destination port designations collectively form the "classification fields". In other implementations, the classification fields may contain: other fields from a packet in addition to these addresses and port designations, fewer fields than these addresses and port designations, or even other fields from a packet. In any event, for purposes of the present embodiment, these addresses and port designations, when appropriately concatenated to form key 200, are used to access a so-called "classification pattern" from a stored database. As shown, key 200 contains four successive contiguous fields as the classification fields: source port designation field 210, destination port designation field 220, source address field 230 and destination address field 240. Though classification patterns are a specific type of pattern, for brevity, we will use these two terms synonymously during the course of describing our invention in the context of a packet classifier.
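As a minimal sketch of how such a key might be formed, the following Python fragment concatenates the four classification fields in the order given for key 200: source port, destination port, source address, destination address. The field widths (16-bit ports and 32-bit IPv4 addresses, giving a 96-bit key) and the function name make_key are assumptions for illustration; the description above does not fix them.

```python
import ipaddress

PORT_BITS = 16  # assumed width of each port designation field
ADDR_BITS = 32  # assumed width of each address field (IPv4)
KEY_BITS = 2 * PORT_BITS + 2 * ADDR_BITS  # 96 bits under these assumptions


def make_key(src_port: int, dst_port: int, src_addr: str, dst_addr: str) -> int:
    """Concatenate the classification fields, most significant field first."""
    src = int(ipaddress.IPv4Address(src_addr))
    dst = int(ipaddress.IPv4Address(dst_addr))
    key = src_port                    # source port designation
    key = (key << PORT_BITS) | dst_port  # destination port designation
    key = (key << ADDR_BITS) | src       # source address
    key = (key << ADDR_BITS) | dst       # destination address
    return key


if __name__ == "__main__":
    key = make_key(1025, 80, "10.0.0.1", "192.168.1.20")
    print(format(key, "024x"))  # 96-bit key shown as 24 hexadecimal digits
```

What matters for classification is that every packet's fields are concatenated in the same order and at the same widths as those used when the stored patterns were formed.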

For each different transmission queue established at a computer, a separate corresponding pattern exists and is stored within a database. Moreover, multiple stored patterns may often correspond to the same queue. In any event, complications arise inasmuch as packets with different values for their classification fields are often to be transmitted at a common data rate, and hence are to be directed to a common transmission queue. In order to direct packets with different values of the classification fields to a common queue, each pattern, such as pattern 250 shown in FIG. 2B, has mask field 255 associated therewith, wherein a separate mask sub-field is provided for each of the individual classification fields, i.e., for each of the source and destination address fields and the source and destination port designation fields. As shown, pattern 250 is formed of: source port designation field 264, destination port designation field 274, source address field 284 and destination address field 294; with mask field 255 being formed, with corresponding concatenated sub-fields, of: source port designation mask sub-field 267, destination port mask sub-field 277, source address mask sub-field 287 and destination address mask sub-field 297.

Each mask sub-field identifies, by a one in a given bit position, each bit that is important (i.e., must be matched in value) in a corresponding classification field (i.e., an address or port designation), and, by a zero in a given bit position, each bit in that classification field that can be ignored, i.e., that which is a so-called "don't care" bit or (as will be equivalently referred to hereinafter) "wildcard". For an example shown in FIG. 2B, consider, as example 285, an address in source address field 284 to be the value 101001010 and a corresponding mask in mask sub-field 287 to be the value 111001110. What this means is that the source address bit-values in address bit positions (starting at MSB (most significant bit) position zero on the left and counting to the right) three, four and eight can be ignored. Hence, the source address becomes 101XX101X where "X" represents a "wildcard". Accordingly, any packet having a source address that contains the values "101" in bit positions zero through two and again in bit positions five through seven, regardless of its bit values in bit positions three, four and eight, will match the source address in pattern 250. By virtue of having a wildcard(s), a single source address in a pattern, such as that in pattern 250, can match against multiple different source addresses in packets. To specify just one particular source address rather than a group, all bits in the mask sub-field for the source address in a pattern would be set to one, i.e., denoting no wildcards; hence requiring a match over all bits in the address field with a corresponding source address in a key for a packet. The source port designation, destination port designation and destination address are all identically specified for a pattern through a combination of an address or port designation in fields 264, 274 and 294, respectively, with a corresponding value in mask sub-fields 267, 277 and 297, respectively, of mask field 255.
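The masked comparison described above can be sketched in Python as follows, using the values of example 285. The function names `matches` and `render` are hypothetical and serve only this illustration; a key matches when it agrees with the stored value in every bit position where the mask holds a one.

```python
def matches(key: int, value: int, mask: int) -> bool:
    """True if key agrees with value in every bit position where mask is 1."""
    return ((key ^ value) & mask) == 0

def render(value: int, mask: int, width: int) -> str:
    """Show a value/mask pair as a string, with 'X' for each wildcard bit."""
    bits = f"{value:0{width}b}"
    care = f"{mask:0{width}b}"
    return "".join(b if c == "1" else "X" for b, c in zip(bits, care))

# Values from example 285 (source address field 284, mask sub-field 287).
VALUE = 0b101001010
MASK = 0b111001110

print(render(VALUE, MASK, 9))            # 101XX101X
print(matches(0b101111011, VALUE, MASK)) # True: differs only in wildcard bits
print(matches(0b100001010, VALUE, MASK)) # False: differs in bit position two
```

Setting every mask bit to one reduces `matches` to an exact equality test, which corresponds to a pattern with no wildcards.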

As noted above, once the database search successfully retrieves a pattern that matches or subsumes within it the key for a packet, the queue designation associated with that pattern specifies a particular transmission queue, within the scheduler, to which that packet will be directed. The queue designations are simply predefined and stored in the database. Since the manner through which the bits in the queue designations are set to specify an associated transmission queue is irrelevant to the present invention, we will omit all those details from the ensuing discussion.

By virtue of permitting wildcards within patterns, the complexity associated with searching the data structure, for a pattern given a key, as well as that of the structure itself increases considerably. Furthermore, the process of classifying packets lies directly within the data flow and adds another layer of processing thereto, i.e., each outgoing packet must be classified before it is launched into the network. Consequently, since packet classification is processor intensive, any added complexity necessitated by accommodating wildcards will require additional processing. Since only a finite amount of processing time can be allocated to process each packet, packet classification should proceed as efficiently as possible--particularly for computers that are expected to handle significant packet traffic.

As discussed in much greater detail below and in accordance with our inventive teachings, we teach an inventive data structure which can rapidly classify specific data to associated wildcard-based patterns. This structure is used illustratively in a packet classifier, but it can be employed in any application that requires storage and rapid retrieval of any patterns containing wildcards. However, prior to discussing specific details of our invention, we will digress somewhat to describe the salient aspects of a typical computer system, such as system 100 shown in FIG. 1, which utilizes our invention.

In that regard, FIG. 3 depicts a block-diagram of illustrative computer system 100, shown in FIG. 1, which utilizes the teachings of the present invention.

As shown, this system, illustratively a personal computer, comprises input interfaces (I/F) 310, processor 320, communications interface 330, memory 340 and output interfaces 370, all conventionally interconnected by bus 380. Memory 340, which generally includes different modalities (all of which are not specifically shown for simplicity), illustratively random access memory (RAM) and hard disk storage, stores operating system (O/S) 110 and application programs 130. As noted above, the specific software modules that implement our inventive teachings are incorporated within O/S 110. This operating system, apart from these modules, may be implemented by any conventional operating system, such as the WINDOWS NT operating system. Given that, we will not discuss any components of O/S 110 other than those needed to specifically implement our invention, inasmuch as the rest are irrelevant.

As shown in FIG. 3, incoming information can arise from two illustrative external sources: network supplied information, e.g., from the Internet and/or other networked facility such as an intra-net (all generally shown as network 60 in FIG. 1), through network connection 50 to communications interface 330 (shown in FIG. 3), or from a dedicated input source via path(s) 305 to input interfaces 310. Dedicated input can originate from a wide variety of sources, e.g., an external database, video feed, scanner or other input source. Input interfaces 310 are connected to path(s) 305 and contain appropriate circuitry to provide the necessary and corresponding electrical connections required to physically connect and interface each differing dedicated source of input information to computer system 100. Under control of the operating system, application programs 130 exchange commands and data with the external sources, via network connection 50 or path(s) 305, to transmit and receive information typically requested by a user during program execution.

Input interfaces 310 can also electrically connect, via leads 382, and interface user input device 385, such as a keyboard and mouse, to computer system 100. Display 394, such as a conventional color monitor, and printer 398, such as a conventional laser printer, can be connected, via leads 392 and 396, respectively, to output interfaces 370. The output interfaces provide requisite circuitry to electrically connect and interface the display and printer to the computer system. As one can appreciate, the particular type of input and output information and a specific modality through which that information is applied to or produced by system 100 are both immaterial for purposes of the present invention and thus will also not be discussed in any detail hereinafter.

In operation, typically one or more application programs 130, such as a browser or other application that requests data, such as here video data, from a remote server, execute under control of O/S 110. For each executing application program, one or more separate task instances are invoked by a user in response to each user specified command, typically entered interactively through appropriate manipulation of user input device 385 given available command choices, such as in a menu or icons in a toolbar, and accompanying information then presented on display 394. Hardcopy output information from an executing application is provided to the user through printer 398.

Furthermore, since the specific hardware components of computer system 100 as well as all aspects of the software stored within memory 340, apart from the modules that implement the present invention, are conventional and well-known, they will not be discussed in any further detail.

FIG. 5A depicts a simplified high-level diagram of packet scheduler 450 that utilizes our inventive packet classifier 410 and forms part of network support software 112 shown in FIG. 4. For ease of understanding, the reader should also simultaneously refer to FIGS. 1, 3, 4 and 5A during the following discussion.

Packets destined for transmission over network 60 from computer 100 are applied, as symbolized by line 405, to scheduler 450 located within network support software 112. These packets can contain multiple streams of different packetized data, such as, for example, separate streams of packetized video data or packetized textual data. These streams, generated by a common computer and regardless of their data content, are transmitted through a common network interface, via network 60, to the same or different network destinations. Though a physical networked connection operates at a single speed, several different streams of packetized traffic, depending on their data content, are interleaved together by network support software 112 for simultaneous transmission at different rates through the network interface--with the rate provided to each stream being governed by its data content. To accomplish this, scheduler 450 contains internal transmission queues 470, composed of individual queues 470₁, 470₂, . . . , 470_(n). Each queue is associated with a different transmission rate. For example, packets of, for example, video data, to be transmitted at 3 Mb/second would be directed to queue 470₁; those packets to be transmitted at 2 Mb/second would be directed to queue 470₂, and so forth for transmission speeds associated with the other queues. Packets to be transmitted at a relatively low data rate, such as packetized textual data, would be directed to queue 470_(n). To properly direct each packet appearing on line 405, scheduler 450 reads the classification fields from each such packet and forms a key therefor. This key is then passed, as symbolized by line 415, to classifier 410. As discussed above, the classifier, in turn, searches for a pattern stored therein which either matches or subsumes the key. If such a pattern is found, classifier 410 accesses the queue designation associated with that pattern and returns that designation, as symbolized by line 420, to the scheduler. In response to this designation, demultiplexor 460 contained within the scheduler directs the packet to the specific queue, i.e., one of queues 470, specified by the designation. By appropriately classifying each packet in succession based on its classification fields and directing that packet to the proper queue therefor, scheduler 450 and classifier 410 collectively ensure that each stream of packets will be scheduled for transmission over network 60 at a rate commensurate with the data carried in that stream. Other software implemented processes (not specifically shown but well-known) successively pull individual packets from the queues, at the corresponding data rates associated therewith, for multiplexing (as collectively symbolized by dashed lines 480) and subsequent transmission, via line 50 and through a network interface adapter (contained within communications interface 330) and accompanying network card driver 540, onto network 60. Inasmuch as these other processes are not relevant to the present invention, they will not be discussed any further.

FIG. 5B depicts a high-level flowchart of our inventive Packet Classification process 550 performed by our inventive packet classifier 410 shown in FIG. 4.

Upon entry to this process--which is separately performed for each packet--block 555 is executed which extracts the fields of interest, i.e., the classification fields, from a header of that packet. Thereafter, the extracted fields are consistently concatenated, by block 560, into a single key, i.e., a current search key, to form key 200 shown in FIG. 2A. To minimize search delay, recently used keys and their associated flows are cached within memory 340 (see FIG. 3) so as to advantageously eliminate the need to conduct a complete database retrieval operation on each and every search key. Hence, once the current search key is formed, block 565, as shown in FIG. 5B, examines a cache to locate that key. Next, decision block 570 determines whether the cache contains the current search key. If it does, then execution proceeds, via YES path 572, to block 575 which retrieves the pattern, and its associated queue designation, for this packet from the cache. Alternatively, if the cache does not contain the current search key, then this decision block routes execution, via NO path 574, to block 580. This latter block, when executed, retrieves the pattern for the current search key by searching our inventive data structure, also referred to herein as a "rhizome" and which will be discussed in considerable detail below, for that pattern. Once this pattern is located, either by retrieving it from the cache or from our inventive rhizome, the queue designation therefor is then returned, through execution of block 590, to a requesting client, thereby completing execution of process 550 for the packet. Though not specifically shown, once the pattern and the queue designation are retrieved from our inventive rhizome, these items are then cached for subsequent short-term retrieval.
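The cache-then-search flow of process 550 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a plain linear scan over (value, mask, queue) triples stands in for the rhizome search of block 580, only the queue designation (not the pattern) is cached, and all names (`make_classifier`, `search_patterns`, `"queue-1"`) are hypothetical.

```python
def make_classifier(patterns):
    """Build a classify(key) function over (value, mask, queue) triples.

    A linear scan stands in for the rhizome search; it is an assumption
    of this sketch, not the retrieval method taught in the text.
    """
    cache = {}
    stats = {"searches": 0}

    def search_patterns(key):
        # Stand-in for block 580: full search of the pattern store.
        stats["searches"] += 1
        for value, mask, queue in patterns:
            if ((key ^ value) & mask) == 0:
                return queue
        return None

    def classify(key):
        if key in cache:                 # blocks 565/570: cache lookup
            return cache[key]            # block 575: cache hit
        queue = search_patterns(key)     # block 580: cache miss
        if queue is not None:
            cache[key] = queue           # cache for short-term reuse
        return queue                     # block 590: return designation

    return classify, stats

# Pattern 1XX10 expressed as a value/mask pair, mapped to a hypothetical queue.
classify, stats = make_classifier([(0b10010, 0b10011, "queue-1")])
first = classify(0b10110)
second = classify(0b10110)
print(first, second, stats["searches"])
```

The second call for the same key is answered from the cache, so the full search runs only once.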

B. Conventional Database Searching vis-a-vis Packet Classification Searching

Our inventive data structure and retrieval algorithms, particularly for use in packet classification, are directed to an entirely opposite problem to that routinely encountered with conventional databases.

In particular, and as shown in FIG. 6A, a conventional database, illustrated by database 630, contains specific pieces of information, such as entries 630₁, 630₂, 630₃ and 630₄, each of which stores a different numeric value. These values are illustratively 1011, 1010, 1111 and 0110 for entries 630₁, 630₂, 630₃ and 630₄, respectively. With such a conventional database, a generalized input pattern, such as pattern 610 here having an illustrative value 1X11 with a wildcard in bit position one, is applied; the database manager is instructed to return all stored data values that match, i.e., are identical to or are subsumed by, the pattern. Accordingly, entries 630₁ and 630₃ will be returned as containing values 1011 and 1111 that are both subsumed within input pattern 1X11.

In sharp contrast, we are faced with the problem, inherent in a packet classifier and as illustrated in FIG. 6B, of searching a database of generalized patterns for a classification pattern that either identically matches, or subsumes within it, a specific input value, i.e., a key. Here, database 670 illustratively contains stored values "1XX10" in entry 670₁, "0X101" in entry 670₂, "01101" in entry 670₃ and "01001" in entry 670₄. Illustrative input key 650, containing a value "10110", is applied to a retrieval process in an attempt to return a stored classification pattern that is either identical to or contains the key. Here, the retrieval succeeds in returning the stored classification pattern "1XX10" in entry 670₁. As one can appreciate, the stored and returned classification pattern "1XX10", containing wildcards in bit positions one and two, subsumes within it the input key "10110".
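The two opposite search problems of FIGS. 6A and 6B can be contrasted with a short Python sketch. The helper `subsumes` is a hypothetical name; patterns and values are written as strings with "X" as the wildcard, and simple list scans are used purely for illustration.

```python
def subsumes(pattern: str, value: str) -> bool:
    """True if every non-wildcard bit of pattern equals the same bit of value."""
    return len(pattern) == len(value) and all(
        p in ("X", v) for p, v in zip(pattern, value)
    )

# FIG. 6A: specific stored values, generalized input pattern.
stored_values = ["1011", "1010", "1111", "0110"]
hits_6a = [v for v in stored_values if subsumes("1X11", v)]

# FIG. 6B: generalized stored patterns, specific input key.
stored_patterns = ["1XX10", "0X101", "01101", "01001"]
hits_6b = [p for p in stored_patterns if subsumes(p, "10110")]

print(hits_6a)  # ['1011', '1111']
print(hits_6b)  # ['1XX10']
```

The same predicate serves both cases; what changes is which side, the query or the stored data, carries the wildcards.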

As noted, a specific input key can identically match or be subsumed within a generalized stored classification pattern. The number of different keys that will be subsumed within any given classification pattern will be dictated by the number of wildcards contained within that pattern. If a stored classification pattern contains no wildcards, then only one key, i.e., that which identically matches the stored classification pattern, will result in that pattern being returned. Alternatively, if a stored four-bit classification pattern contains all wildcards, i.e., the value "XXXX", then any possible four-bit key, for a total of 16 different four-bit keys, will result in that stored classification pattern being returned. In that regard, consider FIG. 7A which depicts stored eight-bit classification pattern 710 having three wildcards, in bit positions two, three and seven. As a result, eight different eight-bit keys 750, as shown in FIG. 7B, when applied to a database containing stored classification pattern 710 will result in that particular classification pattern being returned.
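Since each wildcard can independently take either bit value, a pattern with k wildcards subsumes exactly 2^k keys. A minimal Python sketch of this enumeration follows; the function name `expand` is hypothetical, and the eight-bit pattern "10XX011X" is an assumed example with wildcards in bit positions two, three and seven, since the fixed bits of pattern 710 are not given in the text.

```python
from itertools import product

def expand(pattern: str) -> list:
    """List every key subsumed by a pattern whose wildcards are written 'X'."""
    slots = [i for i, c in enumerate(pattern) if c == "X"]
    keys = []
    for bits in product("01", repeat=len(slots)):
        chars = list(pattern)
        for i, b in zip(slots, bits):
            chars[i] = b          # fill each wildcard position in turn
        keys.append("".join(chars))
    return keys

print(len(expand("XXXX")))      # 16
print(len(expand("10XX011X")))  # 8
print(expand("0101"))           # ['0101']
```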

Classification patterns can also be arranged hierarchically as trees with arbitrary depths and in terms of increasing generality. For example, FIG. 8A depicts six illustrative four-bit classification patterns 810. Three of these classification patterns do not contain any wildcards, while the remaining three contain one, two or three wildcards. These classification patterns result in a pattern hierarchy, i.e., a tree, shown in FIG. 8B having three levels; namely, level 832 being the topmost, level 834 being an intermediate level and level 836 being a lowest level. The lowest level in the tree contains those classification patterns, here 1101, 0111 and 0101, that are the most specific, i.e., contain the fewest wildcards, i.e., here zero. Inasmuch as classification patterns 0111 and 0101 are both subsumed within pattern 01XX, and pattern 1101 is subsumed within pattern 110X, intermediate level 834 contains higher-level patterns 110X and 01XX. Because both of these higher-level patterns are themselves subsumed within classification pattern X1XX, the latter pattern is situated at the highest level. Following a retrieval strategy that if any two or more patterns match (or subsume) a key, the most specific pattern is selected first and applied as output, then for set 850 of sixteen possible four-bit input keys, as shown in FIG. 8C, resulting patterns 870 would be retrieved for those keys. A hierarchy tree need not be binary or even regular; any shape at all can occur provided the tree conforms to the actual hierarchical relationship among the stored classification patterns. Furthermore, a classification pattern does not need to have a hierarchical connection to all or any other such patterns; therefore, a group of classification patterns may include several different hierarchy trees. We will hereinafter refer to a group of trees as a "forest" and a structure realized by a group of classification patterns as a pattern "hierarchy forest".
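The most-specific-first retrieval strategy can be illustrated with the six patterns named above. In the following Python sketch, "most specific" is approximated by "fewest wildcards" among the matching patterns; this shortcut is an assumption that holds for this example because all patterns matching any one key lie along a single chain of the hierarchy. The names `subsumes` and `most_specific` are hypothetical, and a list scan again stands in for a real search structure.

```python
PATTERNS = ["1101", "0111", "0101", "110X", "01XX", "X1XX"]

def subsumes(pattern: str, key: str) -> bool:
    """True if pattern matches key in every non-wildcard bit position."""
    return all(p in ("X", k) for p, k in zip(pattern, key))

def most_specific(key: str, patterns=PATTERNS):
    """Return the matching pattern with the fewest wildcards, or None."""
    matching = [p for p in patterns if subsumes(p, key)]
    return min(matching, key=lambda p: p.count("X"), default=None)

# Tabulate the result for all sixteen four-bit keys.
for n in range(16):
    key = f"{n:04b}"
    print(key, most_specific(key))
```

Keys whose bit position one is zero match none of the six patterns, so the sketch reports `None` for them.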

A pattern classifier functions by retrieving, for a given specific input key, the most specifically matching stored classification pattern from among all the classification patterns stored in a pattern hierarchy forest in a database.

C. Classification Data Structures

We have discovered that a very efficient data structure for pattern classification results from appropriately modifying, in accordance with our inventive teachings, a conventional trie, specifically a Patricia tree, to accommodate hierarchically-related classification patterns that contain wildcards. A trie is a search tree in which branches are followed, during retrieval, based upon a digital value of a search key and wherein the data against which a key is ultimately matched is stored in the leaves of the tree. A Patricia tree is such a binary trie in which each internal node has exactly two children emanating therefrom and hence no one-way branching. A superset of a conventional Patricia tree, the superset which we have developed in accordance with our invention and which we will refer to hereinafter as a "rhizome", advantageously accommodates wildcards in stored patterns--an accommodation wholly absent from a conventional Patricia tree.

To simplify reader understanding, we will proceed from here by first discussing, through examples, a conventional Patricia tree; followed by discussing our inventive rhizome; and finally discussing the various inventive software routines that, in the context of illustrative use in a packet classifier for scheduling, execute within network support software 112 (see FIGS. 1 and 4) and implement retrieval, insertion and removal operations on our inventive rhizome. Our discussions of the conventional Patricia tree and of our inventive rhizome will also include graphical-based descriptions of the associated retrieval, insertion and removal operations thereupon.

1. Conventional Patricia Tree

a. Overview

Any data structure designed for use in packet classification must support retrieval of a stored classification pattern therefrom, insertion of a new classification pattern therein and removal of a stored classification pattern therefrom. One data structure that provides relatively fast retrieval, insertion and removal operations is a Patricia tree.

As noted above, a Patricia tree is a binary trie. Each internal node in this tree has exactly two branches emanating therefrom and contains a bit index value. When a given node is reached, movement therefrom can occur in either of two directions, namely along one branch to a child or along the other branch to another child. The child can be either another internal node or a leaf node. Each leaf in the tree contains a data value. Nodes extend outward from a root of the tree in a hierarchical, typically multi-level, fashion as one traverses a path through the tree. Each internal node contains a numeric bit index value which specifies the specific bit in an input key that governs movement from that node: if that bit has a one value, one branch is taken; else if it has a zero value, the other branch is taken, and so forth through successive nodes along the path until a leaf node is reached. While the bit index for any child is greater than that of its parent, the bit indices encountered along any such path need not be consecutive, merely ascending.

The manner through which nodes are interconnected to form a Patricia tree and the values of their associated bit indices are strictly determined by the value that is to be stored in each leaf node and by disagreements, where they occur starting at the root and continuing at each successively higher bit position, among all such leaf values. Each bit index specifies a bit position at which such a disagreement exists.

With this in mind, consider FIG. 9A which depicts illustrative conventional Patricia tree 900. This tree stores seven data values, labeled for simplicity as A (value 0000001000), B (value 0000101000), C (value 0000111000), D (value 0010001000), E (value 0010001100), F (value 0010001101) and G (value 0011001000) stored in corresponding leaves. This tree, which is relatively simple, contains six internal nodes, hierarchically arranged in at most four levels, specifically nodes 905, 915, 920, 935, 940 and 945. Each node has two branches, specifically a "zero" branch and a "one" branch, emanating therefrom, such as branches 907 and 910 respectively emanating from node 905, and so forth for all the other nodes. As noted above, the nodal pattern is determined by the bit positions at which disagreements exist among the bits of all the stored values, starting at the root and continuing throughout the remaining bit positions.

In that regard, considering values A-G, starting with the highest order bit position (with the most significant bit (bit position zero) situated at the far left) and progressing incrementally in a step-wise fashion throughout the other bit positions from left to right, a disagreement among bits for these stored values first occurs in bit position two. This position yields node 905, i.e., the root of the tree with a bit index of two. Two branches emanate from node 905: zero branch 907 and one branch 910. For those values that have a zero in bit position two (i.e., values A, B and C), the next disagreement occurs in bit position four. Hence, a second node, here node 915, terminates zero branch 907. This second node, assigned a bit index of four, also has zero and one branches emanating from it: here zero and one branches 917 and 919, respectively. Value A, being the only value with a zero in bit positions two and four, terminates zero branch 917 and hence is child zero of node 915. Now, for those values that have a zero in bit position two and a one in bit position four, a further disagreement exists with respect to bit position five, i.e., values B and C respectively have a zero and one at this bit position. Hence, a third node, here node 920 with a bit index of five, terminates one branch 919 (and becomes child one of node 915) with value B terminating zero branch 922 (hence being a child zero of node 920) and value C terminating one branch 924 (hence being a child one of node 920), and so forth.

Hence, given the differences among values A-G in bit positions 2, 3, 4, 5, 7 and 9, conventional Patricia tree 900 results.

Another illustrative example of a Patricia tree is tree 960 depicted in FIG. 9B. Here, tree 960 contains two internal nodes: specifically node 961, being the root; and node 967. This tree stores three values L (value 011011), M (value 011101) and N (value 111011). Inasmuch as these three values, L, M and N, have bit disagreements at bit position zero and next at bit position three, nodes 961 and 967 have bit indices of zero and three, respectively, thus forming tree 960 with three leaves sufficient to store the three values L, M and N. In that regard, root node 961 has zero branch 965 and one branch 963 emanating therefrom, with node 967 terminating the zero branch and a leaf containing value N terminating one branch 963. Node 967 has zero branch 969 and one branch 968 emanating therefrom which terminate at corresponding leaf nodes storing values L and M, respectively. While disagreements exist among these values at higher bit positions, such disagreements can be ignored since no further decision nodes are needed to provide a distinct path through tree 960 to each stored value.
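The way bit disagreements determine the shape of a Patricia tree can be sketched in Python as follows. This is an illustrative batch construction, not the insertion procedure described later: at each step the values are split on the first bit position at which they disagree. Branch nodes are represented as tuples `(bit_index, child_zero, child_one)` and leaves as bit strings; this representation and the names `build` and `bit_indices` are assumptions of the sketch.

```python
def build(values):
    """Build a Patricia tree over distinct, equal-length bit strings."""
    if len(values) == 1:
        return values[0]                      # leaf: the stored value itself
    width = len(values[0])
    # First bit position (MSB is position zero) where the values disagree.
    bit = next(i for i in range(width) if len({v[i] for v in values}) > 1)
    zeros = [v for v in values if v[bit] == "0"]
    ones = [v for v in values if v[bit] == "1"]
    return (bit, build(zeros), build(ones))

def bit_indices(node):
    """Collect the bit index of every branch node, root first."""
    if isinstance(node, str):
        return []
    bit, zero, one = node
    return [bit] + bit_indices(zero) + bit_indices(one)

VALUES_900 = ["0000001000", "0000101000", "0000111000", "0010001000",
              "0010001100", "0010001101", "0011001000"]   # A through G
VALUES_960 = ["011011", "011101", "111011"]               # L, M, N

tree_900 = build(VALUES_900)
tree_960 = build(VALUES_960)
print(sorted(bit_indices(tree_900)))  # [2, 3, 4, 5, 7, 9]
print(tree_960)                       # (0, (3, '011011', '011101'), '111011')
```

For values L, M and N the disagreement at bit position four never produces a node, since bit positions zero and three already separate all three values.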

b. Retrieval

Retrieving data from a Patricia tree involves searching the tree through use of a so-called key. Bit values of the key define a search path that will be traversed through the tree. Advantageously, Patricia trees exhibit a property that either the key will be identically stored in a leaf of the tree that terminates the path designated by the key or the key will not exist in the tree at all. In that regard, a key defines one and only one path through a conventional Patricia tree. Consequently, if a value specified by a key is stored in a Patricia tree, then that value can only be accessed through one unique path, as defined by the key, through the tree. Each leaf stores not only such a value but also, as a data item, a reference field, wherein information stored in the reference field is returned if the search is successful. Usually, the reference field is the object of the search with the key being a mechanism needed to uniquely traverse the tree. As we will describe below in conjunction with our inventive rhizome, the reference field illustratively stores a designation of a transmission queue while the key is correspondingly formed by appropriately concatenating classification fields.

Returning to FIG. 9A, assume that a search key of 0110010110 is provided; the goal is to determine whether this key is stored in this conventional Patricia tree. To determine this, one starts at the root of tree 900 and traverses along a search path therethrough. The search path incrementally progresses through the tree in a branch-wise fashion with the branch taken at each node being specified by a value of a bit in the key, that bit being designated by the bit index of that node. The retrieval operation ends when a leaf is reached. Specifically, given this search key, the root node of tree 900 carries a bit index of two, which, as symbolized by dashed line 903, designates bit two of the key. This particular bit is one which, in turn, causes the search path to then traverse one branch 910 of tree 900. The entire search path for this key is shown by arrows. The next node encountered is node 935 which has a bit index of three. Hence, bit three of the key is examined next. This particular bit in the key carries a zero value thereby causing the search path to next traverse zero branch 937 emanating from node 935. Next, node 940 is encountered which specifies a bit index of seven. Accordingly, bit position seven of the search key is examined. Inasmuch as the key has a one at this bit position, the search path traverses along one branch 944 to node 945. This particular node carries a bit index of nine. Bit position nine of the key contains a zero-bit; hence causing the search path to traverse along zero branch 947 emanating from node 945. Branch 947 leads to a leaf node at which value E is stored. Once this leaf node is encountered, the search key is compared to the value, here E (value 0010001100), stored thereat. Inasmuch as the search key, having a value of 0110010110, does not identically match the stored value at this leaf (i.e., search key ≠ E), the retrieval operation terminates with a result that the key is not present in the structure. Hence, the search was unsuccessful.
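The retrieval just traced can be reproduced with a brief Python sketch. Tree 900 is written out by hand as nested tuples `(bit_index, child_zero, child_one)` with bit strings as leaves; this representation and the name `retrieve` are assumptions made for illustration.

```python
A, B, C = "0000001000", "0000101000", "0000111000"
D, E, F, G = "0010001000", "0010001100", "0010001101", "0011001000"

# Tree 900: root bit index 2; zero side holds A, B, C; one side holds D-G.
TREE_900 = (2, (4, A, (5, B, C)), (3, (7, D, (9, E, F)), G))

def retrieve(tree, key):
    """Follow the key's bits to a leaf, then compare the full key once."""
    node, path = tree, []
    while isinstance(node, tuple):
        bit, child_zero, child_one = node
        path.append(bit)                       # record each bit index examined
        node = child_one if key[bit] == "1" else child_zero
    return node == key, node, path

found, leaf, path = retrieve(TREE_900, "0110010110")
print(found, leaf, path)   # False 0010001100 [2, 3, 7, 9]
```

Only four bits of the ten-bit key are inspected on the way down; the single full comparison at the leaf is what detects that the key is absent.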

As one can clearly appreciate, many different search keys can often produce the same search path through a conventional Patricia tree. However, the search will succeed if and only if, as a result of the comparison operation, the stored value ultimately accessed at a leaf identically matches the key being searched. This is evident in FIG. 9B. Here, two search keys, i.e., keys 011011 (search key 1) and 010010 (search key 2), are applied to tree 960. Both keys result in a common search path, as given by arrows, through the tree, i.e., from root node 961, along zero path 965 (zero in bit zero of the key), to node 967 and along zero path 969 (zero in bit three of the key) emanating therefrom. This search terminates at a leaf node storing value L which is illustratively 011011. Inasmuch as value L matches search key 1 (011011), a search on key 1 succeeds with the reference field (not shown) associated with value L being returned as a result. In contrast, the search on search key 2, which traverses the same search path as search key 1, does not succeed. Specifically, this search progresses from root node 961, along zero path 965 (zero in bit zero of search key 2), to node 967 and along zero path 969 (zero in bit three of this key) emanating therefrom. This search also terminates at the leaf node having stored value L. However, inasmuch as search key 2 (010010) does not identically match stored value L (011011), this search is unsuccessful.
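The behavior of the two search keys on tree 960 can be sketched in Python as follows. Leaves are modeled as small dictionaries holding a value and a reference field; the reference contents ("ref-L" and so on) are placeholders invented for this sketch, since the figure does not show them, and the names `leaf` and `lookup` are likewise hypothetical.

```python
def leaf(value, reference):
    """A leaf record: the stored value plus its reference field."""
    return {"value": value, "reference": reference}

# Tree 960: root tests bit zero; its zero child tests bit three.
TREE_960 = (0,
            (3, leaf("011011", "ref-L"), leaf("011101", "ref-M")),
            leaf("111011", "ref-N"))

def lookup(tree, key):
    """Return (value at the leaf reached, reference or None if no match)."""
    node = tree
    while isinstance(node, tuple):
        bit, child_zero, child_one = node
        node = child_one if key[bit] == "1" else child_zero
    reference = node["reference"] if node["value"] == key else None
    return node["value"], reference

print(lookup(TREE_960, "011011"))  # ('011011', 'ref-L')
print(lookup(TREE_960, "010010"))  # ('011011', None)
```

Both keys arrive at the leaf holding value L, but only the first passes the final comparison and yields a reference.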

c. Insertion

To sufficiently elucidate insertion and, as will be shortly discussed below, removal operations, we will digress slightly to first describe, prior to discussing insertion, the data stored at each node in a Patricia tree--inasmuch as this data is used and appropriately modified during each of these operations.

FIG. 9C depicts this data as collectively data structure 970, arranged, for ease of understanding, in columnar form. In particular, data structure 970 contains data 972 which is stored at a branch node, and data 974 which is stored at each leaf (data node). Branch node data includes, in three separate fields, the pivot bit, and separate pointers (addresses) to child zero and child one of that node. The pivot bit is the bit index value associated with a corresponding branch node. Each data node stores two fields: a value, e.g., a pattern, which is ultimately matched against a search key, and a reference. The reference field is immaterial for search purposes and can store any item. As noted above, when a search of a key succeeds, the content of the reference field, stored at a corresponding data node at which the search terminates, is simply returned for subsequent use.

With the above in mind, FIG. 9D depicts tree 900 shown in FIG. 9A and specifically the data stored at each node in that tree. A starting address at which the data structure for each node is stored in memory is designated by a lower case letter. Specifically, letters a, b, c, d, e, f and g designate these addresses for data structures 918, 923, 925, 943, 948, 953 and 951 which store data for the leaves that store values A, B, C, D, E, F and G, respectively. For example, inasmuch as data structure 918 exists for a leaf having stored pattern A, this structure stores value A, in lieu of a pivot bit, in a first field in this structure, and so forth for the data structures for the other leaves of the tree. To simplify presentation, data has been omitted from the reference field for all the data structures for the leaves of tree 900. Letters m, h, i, j, k and l designate starting addresses for data structures 906, 916, 921, 941, 946 and 936, that store data for branch nodes 905 (root), 915, 920, 940, 945 and 935, respectively. Data structure 906 stores data for the root node of the tree wherein the pivot bit contains the value two, and the following two fields in structure 906 contain addresses h and l as pointers to the child zero and child one nodes, respectively; address h being that for data structure 916 associated with node 915, and address l being that for data structure 936 associated with node 935; and so forth for all the other branch nodes in the tree. Inasmuch as the data structures themselves contain an implicit representation of the tree hierarchy, only these structures themselves are stored in O/S 110 (see FIG. 3) within memory 340.

We will now turn our attention to the insertion operation. For simplicity of illustration, we will discuss both the insertion and removal operations with specific reference to such operations performed on sub-tree portion 902, depicted in FIG. 9D, of conventional Patricia tree 900. Essentially, insertion involves creating a new data structure for each new node and altering the contents, specifically the pointers, in appropriate data structures then existing within the sub-tree portion, as well as storing appropriate pointers in the new data structure, to properly designate paths to and from the new node and existing nodes impacted thereby. Each time a new value is to be added to a Patricia tree, exactly one new branch node and one new leaf node are created. The number of different values stored in a Patricia tree, and hence the number of its leaf nodes, always equals one more than the total number of its branch nodes.

FIG. 9E depicts a conventional insertion operation performed on sub-tree portion 902. Here, illustratively, a node with a pivot bit of six is to be inserted into sub-tree portion 902. This node, given the hierarchy within portion 902, is to be inserted on the child zero branch 937 between node 935, having a pivot bit of three, and node 940 having a pivot bit of seven. As shown, prior to insertion, the child zero pointer for node 935, specifically that stored within data structure 936, points to address j which is the starting address for data structure 941 for node 940; hence, as shown by arrows, linking the data structures for these two nodes together.

Inserting a node arises whenever a new and different value, here designated by H, is to be inserted within sub-tree portion 902. A new branch node is created with a pivot bit designating a lowest bit position at which a disagreement exists between the bits of the values presently stored in the sub-tree portion and the new value H. Here, such a disagreement illustratively occurs at bit position six. Hence, new node 982 (indicated by a dashed box) is created; memory is allocated for new data structure 983 at an illustrative starting address n with this new structure being stored thereat. The pivot bit for this node is set to six, as indicated within structure 983. A new leaf node (also indicated by a separate dashed box), specifically its associated data structure 989 containing value H, is also created with this data structure being assigned illustrative starting address p. The pointers of the existing nodes and new node are now set appropriately to modify the sub-tree to include the new node. In that regard, since new node 982 is reachable through the child zero branch of node 935, then the child zero pointer field for node 935, specifically that stored within data structure 936, is changed to point to new node 982, i.e., address n. The child zero and one branches for new node 982 are specified by addresses j and p, which are stored within data structure 983. As a result, sub-tree portion 902 is transformed into sub-tree portion 902' containing new nodes 982 and 987. Hence, rather than merely linking node 935 (pivot bit three) directly to branch node 940 (pivot bit seven) as in sub-tree portion 902, this linkage, in sub-tree portion 902' now extends through new node 982, as shown by the arrows therein.
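The conventional insertion just described can be sketched, under stated assumptions, as the following Python routine. The names `Leaf`, `Branch`, `insert`, `search` and `count`, and the four-bit sample values, are hypothetical and chosen only for this sketch; bit zero is the leftmost bit. Each insertion of a new value creates exactly one branch node and one leaf node, so the leaf count stays one greater than the branch count.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    value: str
    reference: object


@dataclass
class Branch:
    pivot: int
    child0: object
    child1: object


def search(root, key):
    node = root
    while isinstance(node, Branch):
        node = node.child1 if key[node.pivot] == "1" else node.child0
    return node.reference if node is not None and node.value == key else None


def insert(root, value, reference):
    """Insert value and return the (possibly new) root."""
    if root is None:
        return Leaf(value, reference)
    # 1. Follow the search path for the new value down to an existing leaf.
    node = root
    while isinstance(node, Branch):
        node = node.child1 if value[node.pivot] == "1" else node.child0
    if node.value == value:          # already stored: just update reference
        node.reference = reference
        return root
    # 2. Lowest bit position at which the new value disagrees.
    diff = next(i for i, (a, b) in enumerate(zip(value, node.value)) if a != b)
    # 3. Find the link on the search path where the new branch belongs.
    parent, node = None, root
    while isinstance(node, Branch) and node.pivot < diff:
        parent = node
        node = node.child1 if value[node.pivot] == "1" else node.child0
    # 4. Create one new leaf and one new branch, then splice them in.
    new_leaf = Leaf(value, reference)
    if value[diff] == "1":
        new_branch = Branch(diff, node, new_leaf)
    else:
        new_branch = Branch(diff, new_leaf, node)
    if parent is None:
        return new_branch
    if value[parent.pivot] == "1":
        parent.child1 = new_branch
    else:
        parent.child0 = new_branch
    return root


def count(node):
    """Return (number of branch nodes, number of leaf nodes)."""
    if node is None:
        return (0, 0)
    if isinstance(node, Leaf):
        return (0, 1)
    b0, l0 = count(node.child0)
    b1, l1 = count(node.child1)
    return (1 + b0 + b1, l0 + l1)


root = None
for i, v in enumerate(["0010", "0111", "1000", "0110"]):
    root = insert(root, v, f"ref-{i}")
```

Step 4 corresponds to the pointer changes of FIG. 9E: the existing link (there, the child zero pointer of node 935) is redirected to the new branch node, and the new branch node's two child pointers are set to the displaced sub-tree and the new leaf.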

d. Removal

Essentially, removal involves deleting desired nodal structure, specifically desired leaf nodes and associated branch nodes that provide unique paths to these leaf nodes, from the sub-tree portion and modifying the pointers of remaining affected nodes to exclude the deleted structure. For every stored value that is to be removed from a Patricia tree, only one leaf node and one corresponding branch node will be eliminated.

FIG. 9F depicts a conventional removal operation performed on sub-tree portion 902. Here, illustratively, a leaf node storing value D and associated internal node 940 (with a pivot bit of 7)--as indicated by dashed X's--are to be deleted from sub-tree portion 902. As shown, prior to deletion, the child zero pointer for node 935, specifically that stored within data structure 936, points to address j which is the starting address for data structure 941 for node 940. Node 940, in turn, points through its child one branch to node 945 (having a pivot bit of 9) having data structure 946 with a starting address of k. The linkages among the data structures for these three nodes are shown by arrows.

To remove a leaf node and its associated branch node--the illustrative structure to be removed here being indicated by dashed X's--pointers in remaining nodes are modified to appropriately point around the nodes to be removed, followed by deleting the data structures, for these nodes, from memory and finally de-allocating the memory locations previously used thereby. In that regard, if illustrative value D is to be removed from sub-tree portion 902, then node 940 which provides a unique path to this leaf node is removed as well. To effectuate this, the child zero pointer in data structure 936, for node 935, is modified to point directly to node 945, i.e., to address k, rather than to address j. As such, sub-tree portion 902" results wherein node 935 (pivot bit three) directly links, via child zero branch 937', to branch node 945 (pivot bit nine), specifically address k for data structure 946, as shown by the arrow in this sub-tree portion. Once this occurs, leaf node D and branch node 940 are deleted from the tree; the memory locations used thereby are then appropriately de-allocated and made available for subsequent use.
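As a minimal sketch of this conventional removal, and not a definitive implementation, the following Python fragment rebuilds a sub-tree shaped like portion 902 of FIG. 9F and removes value D. The ten-bit values, the leaves other than D, and the names `Leaf`, `Branch` and `remove` are assumptions introduced for illustration; de-allocation is left to Python's garbage collector.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    value: str
    reference: object


@dataclass
class Branch:
    pivot: int
    child0: object
    child1: object


def remove(root, value):
    """Remove value (if stored) and return the resulting root."""
    grand, parent, node = None, None, root
    while isinstance(node, Branch):
        grand, parent = parent, node
        node = node.child1 if value[node.pivot] == "1" else node.child0
    if node is None or node.value != value:
        return root                   # value not stored: nothing to do
    if parent is None:
        return None                   # tree held only this one leaf
    # The branch node giving the unique path to the leaf goes away with it;
    # its other child is linked directly to the grandparent.
    sibling = parent.child0 if parent.child1 is node else parent.child1
    if grand is None:
        return sibling
    if grand.child0 is parent:
        grand.child0 = sibling
    else:
        grand.child1 = sibling
    return root


# Sub-tree shaped like portion 902 (bit zero is the leftmost bit).
leaf_D = Leaf("0000000000", "ref-D")                  # bit 3 = 0, bit 7 = 0
node_945 = Branch(9, Leaf("0000000100", "ref-hyp-1"),  # bit 7 = 1, bit 9 = 0
                  Leaf("0000000101", "ref-hyp-2"))     # bit 7 = 1, bit 9 = 1
node_940 = Branch(7, leaf_D, node_945)
node_935 = Branch(3, node_940, Leaf("0001000000", "ref-hyp-3"))

subtree = remove(node_935, "0000000000")
```

After the call, the child zero pointer of node 935 refers directly to node 945, which mirrors the change from address j to address k described above.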

For further information on a conventional Patricia tree, the reader is referred to: D. Morrison, "PATRICIA--Practical Algorithm to Retrieve Information Coded in Alphanumeric", Journal of the ACM, Vol. 15, No. 4, October 1968, pages 514-534; D. Knuth, The Art of Computer Programming, Vol. 3--Sorting and Searching (©1973, Addison-Wesley Publishing Company), pages 490-499; and, for a well-known extension to a conventional Patricia tree: T. Merrett et al, "Dynamic Patricia", Proceedings of the International Conference on Foundations of Data Organization, 1985, pages 19-29--all of these references being incorporated by reference herein.

2. Our Inventive Rhizome

a. Overview

As noted above, conventional Patricia trees do not accommodate wildcards at all. In sharp contrast, our inventive rhizome does. By doing so, our invention achieves retrieval efficiencies typically associated with conventional Patricia trees together with wildcard capability, a combination not heretofore provided by the art, which advantageously renders our inventive rhizome particularly well suited for use in, e.g., a packet classifier. Through use of such a classifier, network packets can be classified very rapidly. Furthermore, not only can our inventive structure accommodate wildcards, but also this structure can retrieve, for an input key, stored wildcard-based pattern values in an increasing order of generality, i.e., for such stored values that are hierarchically related, the most specific such stored pattern value will be returned first, as output, followed by each of those pattern values situated at a correspondingly increasing level of generality. As noted above, such a strategy is required of a packet classifier inasmuch as two classification patterns can be identical to each other or one such pattern can be subsumed within the other.

Essentially, through our inventive teachings, we have appropriately modified a Patricia tree, both in terms of its structure as well as the retrieval, insertion and deletion operations used in conjunction therewith, to provide a trie-indexed forest that accommodates hierarchically related wildcard-based stored values. Our inventive rhizome contains two interconnected data structures: a binary search trie (being a "search structure" once stored in memory and/or on a readable storage medium) and a hierarchy forest (being a "hierarchy structure" once stored in memory and/or on a readable storage medium)--the latter required to accommodate wildcard-based patterns and being totally absent from a conventional Patricia tree. Inasmuch as our inventive rhizome is particularly well suited for use in packet classification, for ease of understanding, we will discuss our inventive rhizome in the context of use with packet classification, for use in, e.g., scheduling, in which packet classification patterns, that contain wildcards, are stored in the structure.

In particular, FIG. 10 depicts illustrative rhizome 1000 which embodies the teachings of our present invention. As shown, rhizome 1000, which for enhanced readability is drawn upside-down with a root at the bottom, is organized into binary search trie 1000₁ (comprising branch nodes and child links), which feeds, rather than just individual leaves as in a conventional Patricia tree, hierarchically-arranged wildcard-based patterns in hierarchy forest 1000₂ (comprising pattern nodes and hierarchical links). A subset of nodes and their interconnecting links in our inventive rhizome can be regarded as a sub-structure, of which two are illustratively shown as 1002 and 1004. Each pattern node, such as node 1031, is indicated by a rectangle with a corresponding stored pattern associated with that node being indicated therein; each branch node, such as node 1010, is indicated by a circle with the value of the pivot bit for that node being specified therein.

Rhizome 1000 contains a forest of ten hierarchy trees. Generally speaking, a forest contains one or more trees. One such tree comprises pattern node 1045 at a root, pattern node 1041, pattern node 1031, and pattern node 1033. A second such tree comprises pattern node 1068 at a root and pattern node 1062. The remaining eight hierarchy trees are degenerate, each containing a single pattern node, those pattern nodes being 1018, 1065, 1072, 1078, 1082, 1088, 1092 and 1094. The thick lines in this figure, that connect pairs of individual pattern nodes, indicate a hierarchical relationship between the patterns stored within those nodes. In particular, for the hierarchy tree rooted at pattern node 1045, the lowest-level hierarchically related patterns stored therein, i.e., the most specific, are 00101100XX001011 and 0010110011010011 stored at level 1030 in pattern nodes 1031 and 1033, respectively. At the next higher level, i.e., level 1040, pattern 00101100XX0XXX11, by virtue of its wildcards, stored within pattern node 1041 subsumes patterns 00101100XX001011 and 0010110011010011. Hence, this hierarchical relation between node 1041 and each of nodes 1031 and 1033 is indicated by darkened lines 1037 and 1039, respectively. A similar hierarchical relationship exists between the patterns, i.e., 0011010001110100 and 00X101X00X110100, stored within nodes 1062 and 1068, and is so indicated by a darkened line as well. Furthermore, given the pattern in node 1041 in view of the pattern, i.e., 0010XX00XXXXXX11, stored within node 1045, the latter pattern subsumes the former pattern. Hence, a hierarchical relationship also exists between the latter node and the former node, and is so indicated by again a darkened line. Inasmuch as rhizome 1000 does not store any patterns that are increasingly more general, e.g., 0010XXXXXXXXXX11, than that in node 1045, node 1045, being the most general pattern in this hierarchy, occupies the highest level in this particular hierarchy tree.

For the hierarchy tree rooted at pattern node 1068, the most specific pattern is 0011010001110100 in node 1062. At a higher level, pattern 00X101X00X110100, by virtue of its increased generality, is stored in node 1068.

To generalize the relationships in a hierarchy forest, FIGS. 11A and 11B diagrammatically depict, through Venn diagrams, permitted hierarchical relationships among any pair of patterns stored in a rhizome, and specifically within hierarchy forest 1000₂ shown in FIG. 10. To fully appreciate these figures, the reader should simultaneously refer to FIGS. 10, 11A and 11B throughout the following discussion.

For any two stored patterns, apart from a trivial case of their identicality (which is not shown and will result in only one stored pattern in our inventive rhizome), two stored patterns can either be unique or subsumed one by the other, as represented by depictions 1110 and 1120, respectively, shown in FIGS. 11A and 11B, respectively.

In particular, if both patterns are unique, i.e., one is no more general than the other and neither subsumes the other, as is the case for illustrative patterns 00101100XX001011 and 0010110011010011, stored within nodes 1031 and 1033 at level 1030, then, as represented by corresponding circles 1113 and 1117 in depiction 1110, these patterns are stored independently of each other. The nodes may be indirectly interconnected, as nodes 1031 and 1033 happen to be by node 1041, or they may be completely separate, as is the case for most of the unrelated patterns in structure 1000.

Alternatively, if one pattern is subsumed within the other, as is the case for illustrative patterns 0010110011010011 and 00101100XX0XXX11 stored at nodes 1033 and 1041, then, as represented by circles 1127 and 1123 in depiction 1120, the more general pattern, here 00101100XX0XXX11, completely contains the more specific pattern, here 0010110011010011, and is stored at a higher hierarchical level, here level 1040, in the hierarchy tree than is the latter pattern, here at level 1030.

In addition to identity, independence, and subsumption, another possible relationship exists between two patterns as illustrated by depiction 1130 in FIG. 11C. In this depiction, pattern 00XXXX10 defines a set of sixteen values collectively represented by circle 1133; similarly, pattern 0011XXXX defines a set of sixteen values collectively represented by circle 1137. There are some values, such as 00000010, that fall within circle 1133 but do not fall within circle 1137; there are other values, such as 00110000, that fall within circle 1137 but do not fall within circle 1133; and there are still other values, such as 00110010, that fall within both circles. Therefore, these two patterns are not independent, but neither does either one of them subsume the other. Patterns that partially overlap in this manner may not be stored in our inventive rhizome, since no defined hierarchical relationship holds for these patterns. Such patterns are said to "conflict".
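The four possible relationships between a pair of patterns can be checked mechanically. The following Python sketch, in which the function name `relation` and its returned strings are assumptions made for illustration, classifies two equal-length patterns written with "0", "1" and "X" (wildcard) characters.

```python
def relation(p, q):
    """Classify the relationship between wildcard patterns p and q."""
    assert len(p) == len(q), "all patterns in a rhizome have the same length"
    pairs = list(zip(p, q))
    # A position where both patterns are specified and disagree means no
    # value can match both patterns: they are independent (disjoint).
    if any(a != b and "X" not in (a, b) for a, b in pairs):
        return "independent"
    if p == q:
        return "identical"
    # p subsumes q when every bit that p specifies is specified
    # identically by q, i.e. q only adds constraints.
    if all(a == "X" or a == b for a, b in pairs):
        return "p subsumes q"
    if all(b == "X" or a == b for a, b in pairs):
        return "q subsumes p"
    # The patterns overlap, yet each specifies a bit the other leaves open.
    return "conflict"


r_independent = relation("00101100XX001011", "0010110011010011")
r_subsumes = relation("00101100XX0XXX11", "0010110011010011")
r_conflict = relation("00XXXX10", "0011XXXX")
```

Under this sketch, an insertion routine would reject a new pattern for which `relation` yields "conflict" against any stored pattern, since no hierarchical position exists for it.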

Now, returning to FIG. 10, search trie 1000₁ is situated immediately below and is connected to hierarchy forest 1000₂. The former implements a binary search trie which provides a unique path to each of those patterns situated at the lowest level of the hierarchy forest, and possibly to those at higher levels. This trie, often unbalanced as is the case here, is formed of branch nodes, each of which, as noted above, is indicated by a circle and contains a value of the corresponding pivot bit, i.e., the bit index. Also, for enhanced readability, zero and one branches project to the left and right, respectively, from each branch node. As shown, search trie 1000₁ contains branch nodes 1010, 1015, 1020, 1050, 1055 and 1060 all located on a left side of root node 1005; and branch nodes 1070, 1075, 1080, 1085 and 1090 all situated on a right side of root node 1005.

Generally speaking, the organization of the branches in the search trie, such as search trie 1000₁, and a corresponding pivot bit value at each branch node therein are both governed, in a similar fashion as with a conventional Patricia tree, by bit locations at which bits, in the patterns stored in the hierarchy forest, disagree. A unique search path is defined through the search trie to each pattern stored at the lowest level, and possibly to those at higher levels, of the hierarchy forest. As discussed above, the ordering in the hierarchy forest, such as forest 1000₂, is governed by whatever order of increasing generality, if any and by virtue of wildcards, exists among pairs of the stored patterns.

As one can see, by virtue of accommodating hierarchically related patterns, our inventive modified Patricia tree permits multiple nodes, either branch or pattern nodes and here illustratively branch nodes 1015 and 1050, to point to a common leaf, here illustratively pattern node 1018. In contrast, a conventional Patricia tree does not permit this condition inasmuch as only one node therein can point to any one leaf. However, as with a conventional Patricia tree, our inventive rhizome does not permit one-way branching.

b. Retrieval

As with a conventional Patricia tree, a search of our inventive rhizome is conducted on the basis of an input search key. In a conventional Patricia tree as discussed above, either the search key will match a value found by tracing through the tree or that key will not have a matching value anywhere in the tree. With our inventive rhizome, the search key will match either a specific pattern stored in the rhizome or an increasingly general pattern stored above it in a hierarchical chain, or that key will not have a matching pattern stored anywhere in the rhizome. Hence, the binary search trie effectively narrows down a search from an entire database of stored patterns to just a linear chain of patterns in order of strict hierarchy, the latter being usually substantially smaller in size than the former. A search succeeds if the retrieval process locates a stored pattern that most specifically matches, either identically or in terms of subsuming, the search key.

In particular, to determine whether an input search key exists in rhizome 1000, search trie 1000₁ is searched by first tracing through this trie and specifically branching at each node based on the value of a bit in a search key. Once the search reaches a lowest-level pattern node, the pattern stored at that node is tested against the search key. Inasmuch as the most specific pattern is to be returned for any search key, then if this lowest-level pattern does not match the search key, testing for a match continues, through hierarchy forest 1000₂, with the next higher pattern node in succession until all pattern nodes in a pattern hierarchy chain, that includes and extends from the lowest-level pattern node, have been tested. As soon as a pattern is found in the chain that subsumes the search key, the search is successful and the reference value associated with that particular pattern is returned for subsequent use. In the event no pattern in the hierarchy chain subsumes the search key within it or no such pattern hierarchy chain exists, then the search does not succeed and an appropriate indication therefor is provided by a retrieval routine.

To fully appreciate how a stored wildcard-based pattern is retrieved, let us discuss retrieval in the context of a few examples for rhizome 1000. First, consider the search key 0010110010001011. Searching begins at root node 1005, which has a pivot bit (bit index) of zero. Since the value of bit zero for this search key is zero, the search path traverses a zero branch, emanating from this node, to node 1010. This latter node has a pivot bit of three which for this particular key necessitates that the zero branch emanating from this node is traversed to node 1015. Node 1015 carries a pivot bit of seven, which for key 0010110010001011, identifies bit position seven therein, here being a zero, thereby causing the search to traverse the zero path emanating from branch node 1015 to branch node 1020. This latter branch node carries a pivot bit of eleven, which designates bit eleven in the key. Inasmuch as bit eleven of the exemplary key contains a zero, a zero branch emanating from node 1020 is then traversed to pattern node 1031. This pattern node contains the stored pattern 00101100XX001011. Since, owing to the wildcards within this pattern, the search key is subsumed within the pattern, the search is successful and the reference value associated with this particular pattern node is returned. No further matching occurs, inasmuch as any other higher-level patterns, such as 00101100XX0XXX11 and 0010XX00XXXXXX11 in nodes 1041 and 1045, respectively, that subsume the search key will, owing to strict hierarchy in the rhizome, be more general than 00101100XX001011.

As a further example, consider the search key 1001111101010101. Searching begins at root node 1005, which has a pivot bit (bit index) of zero. Since the value of bit zero of this particular search key is one, the search path traverses a one branch, emanating from this node, to node 1070. This latter node has a pivot bit of two which for this key necessitates that the zero branch emanating from this node is traversed to branch node 1075. Given this key, the search traverses, as a result of indexed bit positions in the key, through a one path emanating from node 1075 to node 1080, from a one path emanating from node 1080 to node 1085, from a zero path emanating from node 1085 to node 1090, and finally from a zero path from node 1090 to pattern node 1092. This pattern node contains the stored pattern 1X0XX01101010X01. This pattern, even with its various wildcards, does not match the search key of 1001111101010101 inasmuch as a bit disagreement occurs in bit position five (the search key having a one at this bit position; the stored pattern containing a zero thereat). Since node 1092 has no pattern hierarchy chain, the search is unsuccessful.

Lastly, consider the search key 0010110011001111. Beginning at root node 1005, the search path then traverses through nodes 1010, 1015, and 1020 and finally reaches pattern node 1031 which stores pattern 00101100XX001011. Though this particular pattern does not match the key--owing to a bit disagreement in bit position thirteen (here, a one in the key, a zero in the stored pattern)--a pattern hierarchy exists above node 1031 in which patterns may be stored, specifically those within nodes 1041 and 1045, that might be sufficiently general to encompass this key. Consequently, the search successively traverses a hierarchical path emanating upward from node 1031 to locate the least general (most specific) pattern, if any exists, that subsumes the key. Pattern node 1041, which contains pattern 00101100XX0XXX11 that does subsume the key, is encountered. Hence, the search is successful with the reference value for node 1041 being returned for subsequent use. If, alternatively, the search key were not subsumed by the pattern in node 1041 but were subsumed by that in node 1045, then the search would still have succeeded; however, the reference field for node 1045 would have been returned instead of that for node 1041.
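The retrieval examples above can be traced with the following Python sketch, which is offered as an illustration under stated assumptions rather than as the retrieval routine itself. Only the left-hand path of rhizome 1000 through nodes 1005, 1010, 1015 and 1020 is built. The placement of pattern node 1033 on the one branch of node 1020, the use of `None` for child links not described in the text, and the names `PatternNode`, `BranchNode`, `pattern_node` and `rhizome_search` are all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatternNode:
    value: int                   # pattern bits, wildcard positions set to 0
    mask: int                    # 1 = specified bit, 0 = wildcard
    reference: object
    godparent: Optional["PatternNode"] = None


@dataclass
class BranchNode:
    pivot: int                   # bit index; bit zero is the leftmost bit
    child0: object
    child1: object


def pattern_node(pattern, reference, godparent=None):
    value = int(pattern.replace("X", "0"), 2)
    mask = int("".join("0" if c == "X" else "1" for c in pattern), 2)
    return PatternNode(value, mask, reference, godparent)


def rhizome_search(root, key):
    """Descend the search trie, then climb the godparent chain."""
    node = root
    while isinstance(node, BranchNode):
        node = node.child1 if key[node.pivot] == "1" else node.child0
    key_bits = int(key, 2)
    while node is not None:
        # The key is subsumed when it agrees on every specified bit.
        if (key_bits ^ node.value) & node.mask == 0:
            return node.reference
        node = node.godparent    # next more general pattern, if any
    return None


n1045 = pattern_node("0010XX00XXXXXX11", "ref-1045")
n1041 = pattern_node("00101100XX0XXX11", "ref-1041", n1045)
n1031 = pattern_node("00101100XX001011", "ref-1031", n1041)
n1033 = pattern_node("0010110011010011", "ref-1033", n1041)
b1020 = BranchNode(11, n1031, n1033)   # one-branch target is assumed
b1015 = BranchNode(7, b1020, None)     # other children omitted in sketch
b1010 = BranchNode(3, b1015, None)
root_1005 = BranchNode(0, b1010, None)

first = rhizome_search(root_1005, "0010110010001011")
last = rhizome_search(root_1005, "0010110011001111")
```

Because the chain is ordered by strictly increasing generality, the first pattern found while climbing is necessarily the most specific one that subsumes the key.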

As one can appreciate, the search trie tends to drastically reduce the number of patterns that need to be examined to locate one that most specifically matches a given input search key. Moreover, owing to the strict hierarchy in which patterns at higher levels in a given hierarchy are strictly more general and hence completely subsume those at lower levels therein, if a bit-by-bit comparison is made of successive patterns in a pattern hierarchy chain against a search key, then as increasingly general patterns are encountered in that chain, there is no need to compare a bit in any bit position of the search key that has previously been compared against and found to match that in any lower level pattern in that chain. In particular, with the last example given above, bit position zero in the search key was found to match bit position zero in pattern 00101100XX001011 stored in node 1031. Hence, when a next higher pattern, i.e., that being 00101100XX0XXX11 in node 1041, is encountered, there is no need to recheck bit zero for the pattern stored in node 1041 as that bit was already found to match, via pattern node 1031, that in the search key. In view of the strict hierarchy, re-checking bit positions that have already been found to match is redundant and hence can be advantageously eliminated to reduce retrieval time. Generally, the maximum number of bit comparisons that needs to be made to retrieve a given search key equals the number of bits in the key itself plus the number of patterns, in the longest hierarchy chain in the hierarchy forest, less one. In practice, further efficiencies might be gained by comparing bits in groups, such as bytes or nibbles, rather than singly since available processors can simultaneously compare certain-sized groups of bits just as fast as they can compare single bits.

c. Insertion

For insertion, the presence of a wildcard(s) in a pattern can cause complications. First, inserting a new pattern containing a wildcard(s) into our inventive rhizome is not as simple, as with a conventional Patricia tree, as inserting a new branch node, a new pattern node and a new path--as previously discussed in the context of FIG. 9E. A single wildcard in a pattern permits a corresponding pattern node to be reached by two different paths. Hence, insertion for our inventive rhizome, based on the bit position of the wildcard(s) in the new pattern, may require inserting multiple branch nodes, some with identical pivot bits, and multiple paths therefrom to reach the new corresponding pattern node. Second, if a pattern is to be inserted into our inventive rhizome that already contains one or more pattern nodes that each carries a wildcard(s), then existing portions of the rhizome may need to be fully or partially replicated to assure that existing pattern nodes can be reached through multiple paths, specifically via both sides of a new branch node that will be created for that new pattern node.

To sufficiently elucidate, for our inventive modified Patricia tree, insertion and, as will be shortly discussed below, removal operations, we will digress slightly to first describe, prior to discussing insertion, the data stored in memory at each node in our inventive rhizome--inasmuch as this data is used and appropriately modified during each of these operations. As will be seen, this data includes additional fields for handling these complications.

FIG. 13 depicts this data as collectively data structure 1300, arranged, for ease of understanding, in columnar form. This data structure, which, when stored in memory and/or on a computer readable medium, forms a "memory structure", encompasses all the fields for a conventional Patricia tree, as shown in structure 970 in FIG. 9C, but with additional fields for both a branch node and a pattern node in order to accommodate wildcards and accompanying pattern hierarchy.

In particular, data structure 1300 shown in FIG. 13 contains data 1310 which is stored at a branch node, and data 1350 which is stored at each pattern node. Data for VALUE, MASK and IMASK fields 1315, 1320 and 1325, are stored for each node, regardless of whether that node is a branch or pattern node. The MASK and IMASK fields, which will shortly be defined and discussed in considerable detail below, are necessary to specify a wildcard(s) at a given pattern node and to propagate the existence of that wildcard(s), at that node, downward into underlying structure of the rhizome. The reason, as will be described in detail below, supporting the need for such propagation is to provide a mechanism at each branch node that indicates whether a section of the structure, starting at that node and extending upward, needs to be replicated (i.e., to provide alternate search paths) during insertion. This propagated information is also necessary, as will be discussed below, for determining, during removal, just how much structure in the rhizome is redundant, in view of a pattern to be removed.

Data specific to a branch node encompasses a pivot bit stored in field 1330, and pointers to child zero and one, of that branch node, stored in fields 1335 and 1340, respectively. Inasmuch as this particular data is used in exactly the same manner as in a conventional Patricia tree and has been discussed above, we will not dwell on that data any further.

Data specific to a pattern node encompasses a reference value and a pointer to a godparent node, as stored in respective fields 1360 and 1365. Reference field 1360 can store any arbitrary data and is used in much the same manner as the reference field in a conventional Patricia tree, i.e., the value of that field is merely returned for subsequent use in the event of a successful search terminating at that node. For use in a packet classifier, the reference field usually stores a class designation for a packet, which for packet scheduling is a designation of a transmission queue. Generally speaking, whenever a new pattern is to be inserted into the rhizome, then a given unique designation is assigned to that pattern and stored in the reference field associated therewith. For any pattern node, field 1365 stores a pointer (address) to a "godparent" node. A godparent for a given pattern node is defined as a next higher-level hierarchically related pattern node. In particular, for a hierarchy chain of pattern nodes, the godparent of a given pattern node in that chain is the next higher pattern node in that chain that is more general than the given pattern node. The pattern node situated at the top of the hierarchy chain has no godparent. If no godparent exists for a pattern node, then a pre-defined null value is stored in godparent field 1365 for that pattern node. Further, by way of definition, for any pattern node that is a godparent, the node situated immediately below it in a hierarchical pattern chain and containing the next more specific pattern, is a "godchild" of that pattern node.

As with a conventional Patricia tree, pivot bit field 1330 stores the pivot bit, i.e., bit index, for a branch node associated therewith. Now, apart from this function, the data stored in pivot bit field 1330 also identifies the type of node, i.e., branch or pattern, for which structure 1300 stores data--this being necessary since, as noted above, the same data structure is used to store data for each of these two different nodes. In particular, for a branch node, the pivot bit field stores a bit index which has a value between zero and one less than the total number of bits in a stored pattern, inclusive. If a value stored in the pivot bit field equals the length of a stored pattern (all the patterns in a rhizome have the same length), then this indicates structure 1300 is storing data for a pattern node. In particular, for a rhizome containing eight-bit patterns, the pivot bit will store values from zero to seven, inclusive, for a branch node, but will be set equal to eight for each pattern node; and so forth for rhizomes storing patterns of other sizes.
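One way to picture the shared record of FIG. 13, offered only as a sketch, is a single Python class holding every field named above, with the pivot bit field doubling as the node-type indicator. The class name `Node`, the constant `KEY_BITS` (fixed at eight here to match the eight-bit example), the helper functions, and the sample field contents are assumptions of this sketch; the IMASK field is carried as an opaque integer.

```python
from dataclasses import dataclass
from typing import Optional

KEY_BITS = 8  # assumed pattern length; all patterns share one length


@dataclass
class Node:
    value: int = 0                       # VALUE field 1315
    mask: int = 0                        # MASK field 1320
    imask: int = 0                       # IMASK field 1325 (opaque here)
    pivot: int = KEY_BITS                # pivot bit field 1330
    child0: Optional["Node"] = None      # field 1335 (branch nodes only)
    child1: Optional["Node"] = None      # field 1340 (branch nodes only)
    reference: object = None             # field 1360 (pattern nodes only)
    godparent: Optional["Node"] = None   # field 1365 (pattern nodes only)


def is_pattern_node(node):
    # A pivot value equal to the pattern length marks a pattern node.
    return node.pivot == KEY_BITS


def is_branch_node(node):
    # A pivot value in 0 .. KEY_BITS - 1 is a genuine bit index.
    return 0 <= node.pivot < KEY_BITS


# Hypothetical pattern 0011XXXX and a branch node testing bit three.
pattern = Node(value=0b00110000, mask=0b11110000, reference="queue-1")
branch = Node(pivot=3, child0=pattern, child1=pattern)
```

Using an out-of-range pivot value as the type tag avoids a separate discriminator field, at the cost of requiring every routine to test the pivot before interpreting the remaining fields.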

Now, as to the VALUE, MASK and IMASK (being an abbreviation for an "inverse mask") fields, the purpose of each of these fields differs between a pattern node and a branch node. To facilitate understanding, we will discuss each of these fields, first in terms of its use in a pattern node and then in terms of its use in a branch node.

Simply stated, for any pattern node, VALUE field 1315 stores the actualbinary pattern therefor.

With respect to the MASK field used in any pattern node, this field specifies the existence of a wildcard in the corresponding pattern (stored in the associated VALUE field). In that regard, if a given bit in the MASK field is a zero, then the corresponding bit in a pattern is a wildcard (denoted by an "X" in the patterns shown, for example, in FIG. 10); if the given bit in the MASK field is a one, then the corresponding bit of the pattern equals the corresponding bit in the VALUE field. Thus, the pattern 00101100XX101011 (stored in pattern node 1031 shown in FIG. 10) is represented by a MASK field of 1111111100111111 and a VALUE field of 0010110000101011. The two adjacent bits in the eighth and ninth bit positions in this pattern, given the wildcards situated there, are irrelevant and could equally be 00, 01, 10 or 11. However, to simplify the logic and testing used in our inventive insertion routine for our inventive rhizome, as discussed in detail below in conjunction with FIGS. 15, 16A and 16B, we set any bit position in the VALUE field, at which a wildcard exists at that bit position in the pattern, to zero. For a shorthand notation, though each of the pattern nodes situated in rhizome 1000 shown in FIG. 10 is formed of a combination of the VALUE and MASK fields, for this and other figures, a wildcard at any bit position in the VALUE field (identified by a corresponding zero bit in the MASK field) is set to "X" to denote that wildcard. As shown in FIG. 2B and discussed above in conjunction therewith, a pattern, for use in packet scheduling, such as pattern 250, is formed by consistently concatenating the classification fields; a MASK field therefor, such as mask field 255, is formed by consistently concatenating the corresponding mask sub-fields for each of the individual classification fields.

As another example, consider the pattern 00101100XX0XXX11 stored in pattern node 1041 shown in FIG. 10; this pattern is represented by a VALUE field of 0010110000000011 and a MASK field of 1111111100100011. Likewise, consider the pattern 0010XX00XXXXXX11 stored in pattern node 1045. This pattern is represented by a VALUE field of 0010000000000011 and a MASK field of 1111001100000011.
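The conversion from the shorthand "X" notation to VALUE and MASK fields can be sketched as follows. The function name encode_pattern is hypothetical, and the sketch assumes, consistent with the bit positions cited above, that bit position zero is the leftmost character of the pattern.

```python
from typing import Tuple


def encode_pattern(text: str) -> Tuple[int, int]:
    """Return (VALUE, MASK) for a pattern written with 0, 1 and X characters.

    A wildcard yields a zero in MASK and, by the convention described
    above, a zero in VALUE as well.
    """
    value = 0
    mask = 0
    for ch in text:
        value <<= 1
        mask <<= 1
        if ch == "1":
            value |= 1
            mask |= 1
        elif ch == "0":
            mask |= 1
        elif ch not in ("X", "x"):
            raise ValueError(f"unexpected pattern character: {ch!r}")
    return value, mask


def to_bits(word: int, width: int) -> str:
    """Render an integer field as a fixed-width binary string."""
    return format(word, f"0{width}b")


value_1041, mask_1041 = encode_pattern("00101100XX0XXX11")
value_1045, mask_1045 = encode_pattern("0010XX00XXXXXX11")
```

Applying to_bits to these results with a width of sixteen reproduces the VALUE and MASK strings given for pattern nodes 1041 and 1045.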

For each pattern node, the IMASK field of that node is set equal to the IMASK field of its godparent, if it exists, of that node. For any pattern node that does not have a godparent, the IMASK field of that particular pattern node is set equal to its own MASK field. Inasmuch as pattern hierarchies in any hierarchy chain follow strict hierarchical rules as the chain is incrementally traversed upward to successively more general patterns, then, setting the IMASK field of a pattern node to that of its godparent is equivalent to stating that the IMASK field of that pattern node is equal to a conjunction (i.e., logical AND combination) of the MASK field of that node and the IMASK field of its godparent. This occurs because, as a result of the strict hierarchy in a pattern chain, the MASK field of the godparent is guaranteed not to have any bits set therein that are not set in the MASK field of the pattern node immediately below it (i.e., a godchild) in the chain.

Furthermore, in each instance where hierarchy is irrelevant or nonexistent, such as for a hierarchy chain that contains just one pattern node, then the MASK and IMASK fields of that pattern node are set equal to each other.
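The IMASK rule for pattern nodes, and its equivalence to a conjunction under a strict hierarchy, can be illustrated with a short sketch. The function name is hypothetical. The sketch further assumes, purely for illustration, that pattern 0010XX00XXXXXX11 is the godparent of pattern 00101100XX0XXX11; the former does subsume the latter, but whether the two are linked in this way in FIG. 10 is not stated here.

```python
from typing import Optional


def pattern_node_imask(mask: int, godparent_imask: Optional[int] = None) -> int:
    """IMASK of a pattern node: inherited from its godparent, else its own MASK."""
    return mask if godparent_imask is None else godparent_imask


# MASK fields of the two example patterns discussed above.
mask_specific = int("1111111100100011", 2)  # 00101100XX0XXX11
mask_general = int("1111001100000011", 2)   # 0010XX00XXXXXX11

# Top of the (assumed) chain: no godparent, so IMASK equals its own MASK.
top_imask = pattern_node_imask(mask_general)

# Godchild: IMASK is copied from the godparent.
child_imask = pattern_node_imask(mask_specific, top_imask)

# Under a strict hierarchy, copying gives the same result as ANDing the
# node's own MASK with the godparent's IMASK.
equivalent = child_imask == (mask_specific & top_imask)
```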

As noted above, the MASK, IMASK and VALUE fields serve a different purpose for any branch node than for any pattern node. Specifically, the fields of any given branch node specify, in condensed form, information about the fields of all patterns reachable from that branch node. There are five possible states maintained for each bit position: (a) all descendent patterns require that the bit be zero; (b) all descendent patterns require that the bit be one; (c) some but not all descendent patterns require that the bit be zero; (d) some but not all descendent patterns require that the bit be one; and (e) no descendent patterns care about the value of the bit. This information is necessary, as stated above, during: (a) insertion, for determining just what portion of the rhizome needs to be replicated in order to insert a new pattern, and (b) removal, for determining just how much structure in the rhizome is redundant, in view of a pattern to be removed.

In particular, the MASK field for any branch node is set to the disjunction (logical OR combination) of the MASK fields of its two child nodes, up to the bit position specified by the pivot bit of this branch node. Since this is a recursive definition extending upward through both the search and hierarchy structures, a bit value of one in a MASK field of a branch node indicates that at least one pattern node, reachable from (i.e., a lineal descendant of) that branch node, cares about the value of that bit. If a bit position in a MASK field of a branch node is a zero, then this means that no descendent pattern, i.e., any of those in a pattern hierarchy chain reachable from this branch node, cares about this bit position; hence, the value at this bit position is irrelevant to all those patterns.

The IMASK field for any branch node is set to the conjunction (logical AND combination) of the IMASK fields of its two child nodes, up to the bit position specified by the pivot bit of this branch node. Since this is also a recursive definition extending upward through both the search and hierarchy structures, a bit value of zero in an IMASK field of a branch node indicates that at least one pattern node, descending from that branch node, does not care about the value of that bit.

The VALUE field for any branch node is set to the disjunction of the VALUE fields of its two child nodes, up to the bit position specified by the pivot bit of this branch node. Since this is also a recursive definition extending upward through both the search and hierarchy structures, a value in a bit position of the VALUE field of a branch node indicates, if at least one pattern node in a pattern hierarchy chain containing this branch node cares about the value of this particular bit, just what that particular bit value is.

As stated above, a zero is set in the VALUE field of each pattern node at a bit position when a wildcard exists in that pattern. Should a one be used here instead of a zero, then determining a branch VALUE field for any branch node would require performing a conjunction operation instead of a disjunction operation.
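The three branch-node definitions above can be gathered into one short sketch. The function names are hypothetical, each child is represented simply as a (VALUE, MASK, IMASK) tuple, and bit position zero is taken to be the leftmost bit. The phrase "up to the bit position specified by the pivot bit" is interpreted here as including the pivot bit position itself; that reading is an assumption of the sketch.

```python
from typing import Tuple

Fields = Tuple[int, int, int]  # (VALUE, MASK, IMASK)


def prefix_mask(pivot_bit: int, nbits: int) -> int:
    """Ones in bit positions 0..pivot_bit (inclusive, by assumption), zeros after."""
    return ((1 << (pivot_bit + 1)) - 1) << (nbits - 1 - pivot_bit)


def summarize_branch(child_zero: Fields, child_one: Fields,
                     pivot_bit: int, nbits: int) -> Fields:
    """Compute a branch node's (VALUE, MASK, IMASK) from its two children."""
    keep = prefix_mask(pivot_bit, nbits)
    value = (child_zero[0] | child_one[0]) & keep   # disjunction of VALUE fields
    mask = (child_zero[1] | child_one[1]) & keep    # disjunction of MASK fields
    imask = (child_zero[2] & child_one[2]) & keep   # conjunction of IMASK fields
    return value, mask, imask


# Two eight-bit pattern nodes without godparents (so IMASK equals MASK):
# 000X1111 and 001011XX, which first disagree at bit position two.
left = (0b00001111, 0b11101111, 0b11101111)
right = (0b00101100, 0b11111100, 0b11111100)
branch_fields = summarize_branch(left, right, pivot_bit=2, nbits=8)
```

Because wildcard positions carry a zero in VALUE, the disjunction never introduces a one that no descendent pattern actually requires.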

Given the existence of pattern-based wildcards in our inventive rhizome, our inventive insertion process maintains the following attributes:

(a) each branch node has exactly two child nodes (children), each of which may be either another branch node or a pattern node;

(b) for any branch node and any descendent pattern stored within the structure, if the bit of that pattern indexed by the pivot bit of this branch node is a zero, then that pattern must be reachable via the child zero branch emanating from that branch node and not via the child one branch emanating from that branch node;

(c) for any branch node and any descendent pattern stored within the structure, if the bit of that pattern indexed by the pivot bit of this branch node is a one, then that pattern must be reachable via the child one branch emanating from that branch node and not via the child zero branch emanating from that branch node;

(d) for any branch node and any descendent pattern stored in the structure, if the bit of that pattern indexed by the pivot bit of this branch node is a wildcard, then this pattern must be reachable via both child one and zero branches emanating from that branch node;

(e) for any two patterns stored in the rhizome, if neither pattern is more general than the other, then there must exist a branch node that meets both of the following two conditions:

(1) the pivot bit index matches a first bit index at which these two patterns disagree; and

(2) this branch node is reached by following a path, from the root node of the structure, taken by branching on all lower-indexed matching bits of the two patterns; and

(f) for any two patterns stored in the structure, specifically within the hierarchy forest therein, if one of these patterns is more general than the other pattern, then the first pattern is a god-ancestor (meaning a godparent or a god-ancestor of a godparent, recursively) of the other pattern.
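Attributes (e) and (f) rest on two tests between patterns: whether one pattern is more general than another, and the first bit index at which two patterns disagree. A minimal sketch of both tests, operating on (VALUE, MASK) pairs, is given below. The function names are hypothetical, bit position zero is taken to be the leftmost bit, and a "disagreement" is taken to mean a position at which both patterns specify a bit and those bits differ; the last point is an assumption of the sketch.

```python
from typing import Optional, Tuple

Pattern = Tuple[int, int]  # (VALUE, MASK), wildcard bits zero in both


def subsumes(general: Pattern, specific: Pattern) -> bool:
    """True if every key matched by `specific` is also matched by `general`."""
    g_value, g_mask = general
    s_value, s_mask = specific
    cares_only_where_specific_cares = (g_mask & ~s_mask) == 0
    agrees_where_general_cares = ((g_value ^ s_value) & g_mask) == 0
    return cares_only_where_specific_cares and agrees_where_general_cares


def more_general(a: Pattern, b: Pattern) -> bool:
    """True if `a` subsumes `b` and the two patterns are not identical."""
    return subsumes(a, b) and a != b


def first_disagreement(a: Pattern, b: Pattern, nbits: int) -> Optional[int]:
    """Lowest bit index at which both patterns care and differ, else None."""
    differ = (a[0] ^ b[0]) & a[1] & b[1]
    for index in range(nbits):
        if (differ >> (nbits - 1 - index)) & 1:
            return index
    return None


p_1041 = (int("0010110000000011", 2), int("1111111100100011", 2))  # 00101100XX0XXX11
p_1045 = (int("0010000000000011", 2), int("1111001100000011", 2))  # 0010XX00XXXXXX11
```

For the two example patterns, more_general(p_1045, p_1041) holds while the reverse does not, and first_disagreement returns None because the patterns never contradict one another.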

To further enhance understanding, we will now turn to FIGS. 12A and 12B. These figures diagrammatically depict an insertion operation in our inventive rhizome and the resulting impact on the VALUE, MASK and IMASK fields. For ease of understanding, the reader should simultaneously refer to both of these figures during the course of the ensuing discussion.

Specifically and illustratively, a new pattern 100X1011XX111X00 is to be inserted into rhizome 1000 shown in FIG. 10. To simplify the drawings, FIG. 12A depicts a vertical chain in this rhizome that will be affected by this insertion. As can be seen from comparing this new pattern with those already stored in the rhizome, the lowest bit position at which a bit disagreement occurs is in bit position ten. Hence, a new branch node, with a pivot bit of ten, specifically node 1210, will be added to sub-structure 1002 shown in FIG. 12A. This new pattern will be stored in a new pattern node, specifically pattern node 1220. To provide proper links to this new pattern node, an existing link from a child one branch emanating from node 1080 to node 1085, this link indicated by a dotted line in FIG. 12B, will be removed and replaced by a link to new branch node 1210. The child zero branch of branch node 1210 will link to node 1085; the child one branch of new branch node 1210 will link to new pattern node 1220, thereby forming sub-structure, here vertical chain, 1002'. The new links inserted into sub-structure 1002 to accommodate the new pattern are shown by solid double lines. In addition, during node insertion, the contents of the VALUE, MASK and IMASK fields for those nodes in the chain below new branch node 1210 are all updated accordingly, through disjunction or conjunction as discussed above, to reflect the corresponding values for the new pattern node, particularly the wildcards therein. In that regard, to the right of each of nodes 1070, 1075, 1080, 1210, 1085, 1090, 1092 and 1220 in FIG. 12B are corresponding tables 1070_(t), 1075_(t), 1080_(t), 1210_(t), 1085_(t), 1090_(t), 1092_(t) and 1220_(t). Each of these tables specifies the contents of the VALUE, MASK and IMASK fields of that corresponding node both without the new node (status being "without"), i.e., for sub-structure 1002 in FIG. 12A, and with the new node having been inserted into the rhizome (status being "with"), i.e., for sub-structure 1002' in FIG. 12B.

Though the child one pointer would also be updated accordingly for node 1080, for simplification, all fields other than VALUE, MASK and IMASK for each of the nodes in sub-structure 1002' have been omitted from FIG. 12B. Based on the description, as given above, of each of these other fields, it will be quite apparent to those skilled in the art how the contents of those fields would be set or modified, as required, to accommodate insertion (as well as for removal as will be described immediately below).

d. Removal

Essentially, removal involves deleting desired nodal structure, specifically desired leaf nodes and associated branch nodes that provide paths to these leaf nodes, and modifying the VALUE, MASK and IMASK fields and associated child pointers of remaining affected nodes in the rhizome to eliminate the deleted structure. For every stored pattern that is to be removed from our inventive rhizome, one or more branch nodes, depending upon any resulting redundancy in the structure occasioned by the removal, will be eliminated. Whether a branch node is to be removed or not is determined by whether the pattern being removed cares about the pivot bit of that branch node. If it cares, but no other hierarchically related pattern so cares, then that branch node is removed. Owing to wildcards, removal of a pattern node can engender removing a significant part of the search trie--not just one branch node and an associated path as shown in FIG. 9F for a conventional Patricia tree.

To illustrate removal, consider FIG. 12C which depicts sub-structure 1002'. Pattern node 1220, which has been inserted into sub-structure 1002 shown in FIG. 12A to produce sub-structure 1002' shown in FIGS. 12B and 12C, will now be removed. Since this node is the only node that presents a bit disagreement at bit position ten, that node along with associated branch node 1210 are both removed. To do so, these nodes are simply deleted and the values of fields VALUE, MASK and IMASK are updated accordingly. The values for these fields, prior to removal, are indicated in the tables shown in FIG. 12B and specifically those under the "with" status designation. With the removal of these nodes, the resulting updated fields will contain the values shown in these tables, specifically those under the "without" designation.

e. Replication and elimination of rhizome structure

1. structural replication upon insertion

As stated above, insertion of a new pattern node into our inventive rhizome may require a portion of the existing rhizome to be replicated. If a pattern node having a new pattern is to be inserted into the rhizome that already contains one or more pattern nodes that each carries a wildcard(s), then existing portions of the rhizome may need to be fully or partially replicated to assure that the extant pattern nodes can be reached through multiple paths, specifically via both sides of a new branch node that will be created for that new pattern node.

FIGS. 12D, 12E and 12F collectively depict an example of insertion that necessitates replication. Here, FIG. 12D depicts sub-structure portion 1004 of rhizome 1000 shown in FIG. 10. The nodes shown in FIG. 12D are re-arranged in FIG. 12E to provide room in the figure for newly added structure, to be shown in FIG. 12F. Here, a new pattern, specifically 0011110010011110, is to be inserted into sub-structure portion 1004. Once inserted, sub-structure portion 1004' depicted in FIG. 12F results. The new pattern, stored in pattern node 1230, causes the creation of new branch node 1240 with a pivot bit of four. This occurs inasmuch as the first bit at which a disagreement now occurs, between any existing pattern stored in the structure and the new pattern, is at bit position four. Since existing patterns 0011010001110100 and 00X101X00X110100 that disagree with the new pattern have a zero in bit position four, the existing sub-structure for these particular patterns must reside below child zero of the new branch node. Moreover, since two of the patterns reachable through new branch node 1240, specifically 00XXX00111010010 and 0011X10000110000, have a wildcard in bit position four, these patterns must be reachable from both sides of this new branch node. Therefore, structure leading to the pattern nodes for these two patterns must be replicated below child one of the new branch node. Inasmuch as the first bit at which these two particular patterns disagree amongst themselves occurs at bit position five, then branch node 1050 is replicated as branch node 1245 to distinguish between these two patterns. Finally, a new branch node 1250 with a pivot bit of eight is created to completely distinguish the newly inserted pattern from those already in the structure. The replicated structure, here pattern nodes 1018 and 1065 contained in dashed blocks, is achieved by additional child zero paths 1262 and 1266 emanating from new branch nodes 1245 and 1250, respectively. To greatly simplify FIGS. 12D-12F, the values of the VALUE, MASK and IMASK fields have been intentionally omitted, for each of the nodes, therefrom; nevertheless, these values can be easily determined through the specific manner set forth above.

2. structural elimination upon removal

As noted above, owing to the presence of wildcard(s) in a stored pattern(s), removal of a pattern node, whether or not it itself contains a wildcard, can engender removing a significant part of the search trie--not just one branch node and an associated path as is the case with a conventional Patricia tree.

Similar to our prior description of FIGS. 12C and 12B in the case of removal, consider FIGS. 12F and 12E which depict sub-structures 1004' and 1004, respectively. Pattern node 1230, containing pattern 0011110010011110, which has been inserted into sub-structure 1004 shown in FIG. 12E to produce sub-structure 1004' shown in FIG. 12F, will now be removed from sub-structure 1004'. Here, the branch node, i.e., branch node 1250 with a pivot bit of eight, that is a parent to this particular pattern node (node 1230), would be left with only one child branch upon removal of this pattern node and thus should be removed. Consequently, once this branch node is removed, pattern node 1065 containing pattern 0011010000110000 would extend from a child one branch of branch node 1245 which has a pivot bit of five. Since at this point, none of the patterns descending from branch node 1240, having a pivot bit of four, would care about the value of the bit in bit position four, this branch node can be removed as well, with its child one branch also being eliminated. Consequently, the sub-structure that results from removing pattern node 1230 is sub-structure 1004 depicted in FIG. 12E.

D. Software Routines for Use with Our Inventive Rhizome for Packet Classification

There may exist applications for which it is necessary to store patterns containing wildcards in a database, yet for which hierarchical pattern retrieval is not needed. In such cases, virtually all of the structure thus far described is still required, with the exception of the godparent pointer 1365 in nodal data structure 1300, which can be omitted. However, the routines which implement retrieval, insertion, and removal can be simplified significantly for such applications. Therefore, the present section will describe two embodiments of the invention, the hierarchical rhizome and the non-hierarchical rhizome, and the routines will similarly be known as hierarchical retrieval and non-hierarchical retrieval, and so forth. To simplify reader understanding, the description refers to the hierarchical rhizome, in keeping with the description thus far presented. Where appropriate, the ensuing description will specifically delineate those portions of the routines which can be deleted for the non-hierarchical embodiment.

Of all the operations, i.e., retrieval, insertion and deletion, to be performed on our inventive rhizome, retrieval is expected to be the most frequently used, and in that regard, substantially more often than will either insertion or deletion. In the context of a packet scheduler, this will occur because each and every packet destined for network transport will need to be separately classified to determine its corresponding transmission queue. In that regard, for each such packet, its classification fields, i.e., its source-destination addresses and source-destination port designations, will be used in concatenated form, as a search key, to retrieve an associated queue designation from our inventive rhizome. Insertion and deletion, on the other hand, will be used merely to establish or remove queue designations from the rhizome. Clearly, these two operations will not be used with each and every packet. Given this, the retrieval operation used with our inventive rhizome is intentionally far simpler and significantly faster than both the insertion and deletion operations.

1. Retrieval (search) routine

FIG. 14 depicts a flowchart of Retrieval routine 1400 which executes as step 580 shown in FIG. 5B within pattern classifier 410 shown in FIG. 4. Given an input search key, routine 1400 searches our inventive rhizome in an attempt to locate a stored pattern, in a pattern node, that matches or, in its least general form, subsumes the search key, and then returns a reference value, i.e., a queue designation for packet scheduling, associated with the resulting pattern.

Upon entry into routine 1400, execution first proceeds to decision block 1405, which determines whether the entire rhizome (structure) is empty. If the rhizome is empty, then clearly no match will result; the search is unsuccessful. In this case, execution exits from the routine, via YES path 1407 emanating from decision block 1405, and a "no match" message is returned. Alternatively, if the rhizome is not empty, then execution is directed, via NO path 1409, to block 1410. This latter block initializes a pointer to point to the root node of the rhizome, specifically that within the search trie thereof. Execution proceeds to a loop containing blocks 1415 and 1420 which appropriately traverses through the search trie based on the bits of the search key and the pivot bits of the branch nodes in that trie. In particular, execution proceeds to decision block 1415 which tests, through examining the pivot bit field as discussed above in conjunction with FIG. 13, whether the pointer currently points to a pattern node. If the pointer does not currently point to a pattern node, but rather to a branch node, then this decision block routes execution, via NO path 1417 shown in FIG. 14, to block 1420. This latter block, when executed, examines the bit of the search key designated by the pivot bit of this branch node to which the pointer is currently pointing. Thereafter, this block sets the value of the pointer to point to a child (child one or zero) of this branch node as specified by this bit in the search key. Execution then proceeds, via feedback path 1422, back to decision block 1415 to test the type of a latest node to which the pointer is currently pointing, and so forth until a path is fully traversed through the search trie and a pattern node is first encountered.

When decision block 1415 determines that the pointer is pointing to a pattern node, then it routes execution, via YES path 1419, to block 1425. This latter block initializes a bit index to zero. Thereafter, a loop is entered containing blocks 1430, 1440, 1450, 1462 and 1466, to perform a bit-by-bit comparison on each pattern encountered including patterns in godparent nodes in a hierarchical pattern chain.

In particular, decision block 1430 determines whether the bit index has a value less than the number of bits in the search key (the length of which is the same as that of all the patterns in the rhizome). If not, then all bits in the search key have been tested, and hence retrieval is concluded. In this case, execution exits, via NO path 1434 emanating from decision block 1430. At this point, the pointer will have been set to point to a stored pattern that is the most specific match to that of the search key. In this case, the search has succeeded; the value of the reference field associated with this stored pattern is returned.

Alternatively, if more bits remain to be tested, i.e., the bit index is less than the number of bits in the search key, then decision block 1430 routes execution, via YES path 1432, to decision block 1440. Decision block 1440 determines whether the value of the particular bit, designated by the current value of the bit index, in the search key matches a value situated in the same bit position in the stored pattern in the pattern node to which the pointer is currently pointing. If these bit values match, meaning either that they specify the same value or that the bit in the pattern is a wildcard, then decision block 1440 routes execution, via YES path 1442, to block 1450. This latter block, when executed, increments the bit index to point to the next bit position. Thereafter, execution loops back, via path 1470, to decision block 1430 to test a bit in the next successive bit position in the search key against that in the stored pattern, and so forth.

In the non-hierarchical retrieval routine, the blocks within dotted box 1460, specifically decision block 1462 and block 1466, can be omitted. If, in this case, as a result of decision block 1440, a bit of the search key does not match a bit in the same bit position in the stored pattern for the first and only pattern node encountered, then there is no match and the retrieval operation is unsuccessful. Hence, execution exits from routine 1400, via dashed path 1480, with a "no match" message being returned.

In the hierarchical retrieval routine, the operations in dotted box 1460 are performed. Accordingly, once a bit is encountered in the search key that does not match a bit in the same bit position in the stored pattern in the current pattern node, comparison is continued with a godparent of that pattern node, should that godparent exist. Hence, if a match, at the present bit position, between the search key and the stored pattern in the current pattern node is not found, then decision block 1440 routes execution, via NO path 1444, to decision block 1462. This latter decision block, when executed, determines whether the current pattern node has a godparent node by testing its pointer to godparent field 1365 shown in FIG. 13. As noted above, if a current pattern node does not have a godparent node, a null value will be stored in this field. If the current pattern node has a godparent, as indicated by this field, then decision block 1462 shown in FIG. 14 routes execution, via YES path 1464, to block 1466. This latter block sets the pointer to point to this godparent node. Execution then loops back, via path 1470, to test the search key against one or more successive bit positions in the pattern stored in this godparent node. As noted above, by virtue of the strict hierarchy inherent in the hierarchy structure, specifically in hierarchical pattern chains therein, there is no need, at any pattern node in the chain, to recheck any bits at bit positions that have already been tested at lower levels in the chain. Accordingly, the bit index is not reset prior to looping back, via feedback path 1470, to decision block 1430 in order to match successive bit(s) in the search key against bit(s) in the same successive bit position(s) in the pattern stored in this godparent node.

Alternatively, if the current pattern node does not have a godparent node, meaning that routine 1400 has now reached the end of a hierarchical chain and a match has not been found between the search key and stored patterns in all the pattern nodes in that chain, i.e., no pattern stored in a node in that chain identically matches or subsumes the search key, then the retrieval is unsuccessful. Accordingly, execution exits from routine 1400, via NO path 1468, with a "no match" message being returned.
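The flow of hierarchical retrieval described above can be summarized in the following sketch, whose comments cite the corresponding flowchart blocks. It is an illustration under stated assumptions rather than the routine itself: the Node class, its field names, the constant NBITS, the helper bit and the use of None to signal "no match" are all hypothetical, eight-bit patterns are assumed, and bit position zero is taken to be the leftmost bit.

```python
from dataclasses import dataclass
from typing import Any, Optional

NBITS = 8  # assumed pattern and key length for this sketch


@dataclass
class Node:
    pivot_bit: int                        # equals NBITS for a pattern node
    value: int = 0
    mask: int = 0
    child_zero: Optional["Node"] = None
    child_one: Optional["Node"] = None
    reference: Any = None
    godparent: Optional["Node"] = None


def bit(word: int, index: int) -> int:
    """Bit of `word` at position `index`, counting from the leftmost bit."""
    return (word >> (NBITS - 1 - index)) & 1


def retrieve(root: Optional[Node], key: int) -> Any:
    if root is None:                                   # block 1405: empty rhizome
        return None
    node = root                                        # block 1410
    while node.pivot_bit != NBITS:                     # block 1415: branch node?
        # block 1420: follow the child selected by the key bit at the pivot
        node = node.child_one if bit(key, node.pivot_bit) else node.child_zero
    index = 0                                          # block 1425
    while index < NBITS:                               # block 1430
        wildcard = bit(node.mask, index) == 0
        if wildcard or bit(node.value, index) == bit(key, index):  # block 1440
            index += 1                                 # block 1450
        elif node.godparent is not None:               # block 1462
            node = node.godparent                      # block 1466; index not reset
        else:
            return None                                # end of chain: no match
    return node.reference


# A tiny hand-built rhizome: 0010XXXX is the godparent of 00101100,
# and 1XXXXXXX sits on the child one side of a root branch at bit zero.
general = Node(NBITS, value=0b00100000, mask=0b11110000, reference="Q-general")
specific = Node(NBITS, value=0b00101100, mask=0b11111111,
                reference="Q-specific", godparent=general)
ones = Node(NBITS, value=0b10000000, mask=0b10000000, reference="Q-one")
root = Node(0, child_zero=specific, child_one=ones)
```

Omitting the two godparent lines, so that a mismatch returns None at once, yields the non-hierarchical variant.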

2. Insertion process

i. Insertion routine 1500

FIG. 15 depicts a flowchart of Insertion routine 1500 which is executed by pattern classifier 410 shown in FIG. 4. This routine, combined with routine 1600, is used to insert a new pattern node into our inventive rhizome. Apart from routine 1600, which will be discussed shortly below in conjunction with FIGS. 16A-16B, routine 1500 performs relatively few operations.

In particular, routine 1500 is entered with two parameters: VALUE and MASK. Together, as noted above, these parameters collectively specify the new pattern to be inserted, including its wildcard(s), if any. Upon entry into routine 1500, execution first proceeds to block 1510. This block, when executed, allocates sufficient memory area to accommodate a new pattern node, i.e., sufficient memory locations to store data structure 1300 (as shown in FIG. 13) for this new node. Once this allocation has occurred, block 1510 then sets the values of various fields in this structure appropriately for this new node. In particular, in order to ensure that the bits of the VALUE field that correspond to wildcard indications in the MASK field are zero, the VALUE field is set to the conjunction of the given VALUE and MASK values supplied, upon entry, to this routine. The MASK and IMASK fields are set equal to the MASK value supplied to this routine. The pivot bit for this new node is set equal to the number of bits in this pattern (thereby, as in the manner stated above, distinguishing this node from a branch node). Thereafter, the pointer to godparent field for this new pattern node is set equal to a null value. Finally, the reference field is then appropriately set to a particular predefined data value, e.g., a queue designation, given for this node. Since the manner through which the value for the reference field is determined and assigned is immaterial to the present invention, it will not be described in any further detail.
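The field initialization performed by block 1510 can be sketched as follows. The function name and the dictionary keys are hypothetical stand-ins for the fields of data structure 1300, and the reference value is simply passed in, since the manner of choosing it is outside the scope of the routine.

```python
from typing import Any, Dict


def new_pattern_node(value: int, mask: int, reference: Any, nbits: int) -> Dict[str, Any]:
    """Build the fields of a fresh pattern node in the manner of block 1510."""
    return {
        "value": value & mask,   # force wildcard positions of VALUE to zero
        "mask": mask,            # MASK as supplied
        "imask": mask,           # IMASK initially equals MASK
        "pivot_bit": nbits,      # pattern length marks this as a pattern node
        "godparent": None,       # null: no godparent yet
        "reference": reference,  # e.g., a queue designation
    }


# The caller supplied ones in wildcard positions; they are cleared in VALUE.
node = new_pattern_node(0b00101111, 0b11110000, "queue-3", nbits=8)
```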

Once block 1510 completes its execution, execution proceeds to decision block 1520. This decision block tests whether the rhizome (structure) is empty. If so, then the new pattern node simply becomes the root of the rhizome. In this case, the decision block routes execution, via YES path 1527, to block 1540 which sets the root of the search trie to point to the new pattern node. Thereafter, execution exits from insertion routine 1500. Alternatively, if the rhizome is not empty, then decision block 1520 routes execution, via NO path 1523, to block 1530. This latter block executes Node Insert routine 1600 to properly insert the new pattern node into the rhizome. This routine is called with two specific arguments; namely, a pointer to the root node of the rhizome and a bit index of "-1". Inasmuch as bit positions, as noted above, start with a zero value, a value of "-1" specifies that no bits of a search key have been examined thus far, i.e., an initial condition. Inasmuch as routine 1600 is recursively invoked, as and if needed, to insert and replicate structure required to handle the new pattern node being inserted, then, once routine 1600 has fully completed its execution, including all recursive iterations thereof, block 1530 completes its execution. Thereafter, execution simply exits from routine 1500.

ii. Node Insert routine 1600

A flowchart of Node Insert routine 1600 is collectively depicted in FIGS. 16A-16B; the correct alignment of the drawing sheets for these figures is shown in FIG. 16. As noted immediately above, routine 1600 properly inserts a new pattern node into the rhizome and is called recursively, if necessary, to replicate appropriate amounts of rhizome structure.

Routine 1600 is entered with two parameters: POINTER and PREVIOUS BIT EXAMINED. The value of POINTER specifies a current node being examined, which is initially set to point to the root node; PREVIOUS BIT EXAMINED, which is initially set to "-1", specifies a bit position of the new pattern that was last examined by this routine, which, for any non-negative setting, occurred through an immediately prior recursive execution thereof.

We will describe this routine first in a summary overview fashion and then in detail.

Essentially, routine 1600 first examines all of the bit indices between that of a previous branch node, starting at the root node, and that of a current node indicated by the value of POINTER (which hereinafter we will also refer to, for clarity, as the "pointed-to node"). If corresponding bits of a new pattern and the pointed-to node disagree at any of these bit indices, then this routine inserts a new branch node above the current node. The current node becomes one child of the new branch; a new pattern node, storing the new pattern, becomes the other child. However, if any of the descendants of the pointed-to node do not care about the value of the pivot bit of the newly inserted branch node, then these descendants must be copied (i.e., replicated) to the other child of the new branch node. If any such nodes are so copied, then this routine continues with the other child node. Alternatively, if no nodes are copied, then the new pattern node, for the new pattern, can be inserted directly as the other child, after which point this routine terminates.

Now, when all of the bit indices up to that of the pointed-to node have been examined, and if the new pattern node was not already inserted, routine 1600 determines whether the pointed-to node is a pattern node or a branch node.

If the pointed-to node is a branch node, then this routine will update the fields (specifically VALUE, MASK and IMASK) for the pointed-to node, and will recursively call itself. Specifically, if the new pattern does not care about the value of the bit specified by the pivot bit of the pointed-to node (i.e., a wildcard exists in the new pattern at this bit position), then this routine recurses on both children of this node. Otherwise, if the new pattern does not have a wildcard at the bit position indexed by the pivot bit of the pointed-to node, then routine 1600 recurses only on the child determined by the bit at this position in the new pattern.

Alternatively, if the pointed-to node is a pattern node, then routine 1600 traces through a pattern hierarchy chain updating each successive pattern node encountered. This continues until one of four conditions occurs: (a) the new pattern is already found in the chain (because it was previously inserted via another path), in which case insertion is complete and the routine simply exits; (b) the new pattern is more specific than that of a pattern found at a then pointed-to node in the chain, in which case the new pattern is inserted before the pointed-to node and routine 1600 terminates; (c) the new pattern conflicts (as illustrated in depiction 1130 in FIG. 11C) with a pattern found at a then pointed-to (pattern) node in the chain, in which case the new pattern node is removed from the rhizome (if it had been inserted elsewhere therein) and execution of routine 1600 aborts; and (d) the pointed-to (pattern) node does not have a godparent, in which case a new node, for the new pattern, is inserted above the current pattern node in the pattern hierarchy chain.

Specifically, upon entry into routine 1600, execution first proceeds to block 1603. This block, when executed, sets a bit index to equal the PREVIOUS BIT EXAMINED plus one (which in the case of the initial execution of this routine becomes zero, hence pointing to the first bit in the new pattern). Execution then proceeds to a loop containing blocks 1606-1638 to successively examine each bit in the new pattern, against the corresponding bits in the fields of the pointed-to node, up to that specified by the pivot bit of the pointed-to node in order to locate any bit disagreement therebetween. In particular, decision block 1606 tests whether the bit index is less than the pivot bit for the pointed-to node.

If the bit index is less than the pivot bit of the pointed-to node, then decision block 1606 routes execution, via YES path 1607, to decision block 1610. The latter decision block determines, by examining corresponding bits, whether a bit, specified by the bit index, in the new pattern and that in the pointed-to node disagree, meaning that one pattern specifies that the bit be zero and the other pattern specifies that it be one. If no such disagreement exists, then execution is fed back from decision block 1610, via NO path 1612, to block 1638, which, when executed, increments the bit index by one to point to the next bit. Execution then proceeds to decision block 1606 to test this next bit, and so forth. Alternatively, if such a disagreement exists, then decision block 1610 routes execution, via YES path 1614, to block 1615. Consequently, a new branch node must now be inserted into the rhizome, specifically the search trie therein, with a pivot bit that specifies the bit position where the disagreement exists. To do so, block 1615 allocates a new branch node and appropriately sets various fields associated with this node. In particular, the VALUE field is set to the disjunction (logical OR combination) of the VALUE field of the new pattern and the VALUE field of the pointed-to node. Similarly, the MASK field is set to the disjunction of the MASK field of the new pattern and the MASK field of the pointed-to node. Also, the IMASK field is set to the conjunction (logical AND combination) of the MASK field of the new pattern and the IMASK field of the pointed-to node. By setting these fields appropriately, based on those of higher-level nodes in the rhizome, the existence of wildcards at pattern nodes reachable through the new branch node will be propagated to that branch node. In addition, block 1615 sets a specific child pointer of the new branch node to the pointed-to node.
The specific child pointer, i.e., the child zero or child one pointer, is selected based on the indexed bit of the VALUE field of the pointed-to node. The pivot bit for the new branch node is set equal to the current bit index. Once these operations have completed, execution proceeds from block 1615 to decision block 1620. This latter decision block determines whether a wildcard exists, in patterns reachable by the new branch node, at the bit position specified by the current bit index. If no such wildcards exist at this bit position, i.e., the value of the IMASK bit at this bit position in the pointed-to node is one, then decision block 1620 routes execution, via YES path 1624, to block 1625. This latter block sets the specific child pointer for the new branch node, as designated by the indexed bit of the VALUE field of the new pattern, to point to the new pattern node. Once this occurs, execution then exits from routine 1600. Alternatively, if a wildcard exists, in patterns reachable by the new branch node, at the bit position specified by the current bit index, then decision block 1620 routes execution, via NO path 1622, to block 1630. As a result of the wildcard, portions of the existing rhizome, above the pointed-to node, must be replicated to properly support the wildcard. Consequently, block 1630, when executed, invokes Replicate routine 1900 (as will be discussed in detail below in conjunction with FIG. 19) with two arguments: POINTED-TO NODE and the current INDEXED BIT. The replica returned by the replicate routine becomes a child of the new branch node, specifically the child corresponding to the indexed bit of the new pattern. The value of POINTER is set to point to this child.
Once block 1630 has fully executed, i.e., Replicate routine 1900 has completed all its recursion and POINTER has been finally set, then execution is fed back, via path 1635, to block 1638 to increment the bit index for examining the next bit position in the new pattern vis-a-vis the next pivot bit in the rhizome, and so forth.
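
The bit-disagreement scan of blocks 1606-1638 and the field settings of block 1615 can be illustrated with a minimal sketch. It assumes that VALUE, MASK and IMASK are held as integers, that bit index zero is the leftmost bit of an 8-bit key, and that a "disagreement" means both MASK bits are one while the VALUE bits differ. The names `WIDTH`, `bit`, `first_disagreement` and `new_branch_fields` are illustrative only and do not appear in the figures.

```python
WIDTH = 8  # assumed key width in bits (illustrative only)


def bit(word, index):
    """Return bit `index` of `word`; index 0 is taken to be the leftmost bit (an assumption)."""
    return (word >> (WIDTH - 1 - index)) & 1


def first_disagreement(new_value, new_mask, node_value, node_mask, start, pivot):
    """Scan bit indices start..pivot-1 and return the first index at which
    both patterns care about the bit (MASK bit is one) but specify different
    values; return None if no such index exists."""
    for i in range(start, pivot):
        both_care = bit(new_mask, i) and bit(node_mask, i)
        if both_care and bit(new_value, i) != bit(node_value, i):
            return i
    return None


def new_branch_fields(new_value, new_mask, node_value, node_mask, node_imask):
    """Fields of the new branch node as described for block 1615:
    VALUE and MASK are disjunctions, IMASK is a conjunction."""
    return {
        "VALUE": new_value | node_value,
        "MASK": new_mask | node_mask,
        "IMASK": new_mask & node_imask,
    }


# New pattern 1010**** against an existing fully specified pattern 10000000.
pivot = first_disagreement(0b10100000, 0b11110000, 0b10000000, 0b11111111, 0, WIDTH)
fields = new_branch_fields(0b10100000, 0b11110000, 0b10000000, 0b11111111, 0b11111111)
```

In this example the scan stops at bit index 2, and the resulting IMASK has zeros in the four rightmost positions, which records that a pattern with wildcards at those positions is reachable through the new branch node.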

When the bit index reaches the value of the pivot bit of the pointed-to node, execution then proceeds, via NO path 1609 emanating from decision block 1606, to decision block 1640. This latter decision block determines, based on testing the value of the pivot bit stored for the pointed-to node, whether that node is a pattern node or not. If the pointed-to node is a branch node, then, having traversed this far up the search trie to where the index bit equals the pivot bit of the pointed-to node, the pattern needs to be inserted somewhere above the pointed-to node. Therefore, execution proceeds via NO path 1641 to block 1648, where the VALUE, MASK and IMASK fields in the pointed-to node are modified accordingly to reflect this ancestry. In particular, the VALUE and MASK fields of the pointed-to node are formed by separately disjoining the existing VALUE and MASK fields of this node with the VALUE and MASK fields of the new pattern, respectively. The IMASK field of the pointed-to node is formed by conjoining the existing IMASK field of this node with the IMASK field of the new pattern. Once these fields for the pointed-to node have been updated, execution proceeds to decision block 1650 to determine whether a wildcard exists in the new pattern at the bit position indexed by the pivot bit of the pointed-to node. If a wildcard does not exist, as given by a one in the MASK field at this bit position in the new pattern, then the new node will be inserted (as well as its supporting structure, with replication if necessary) on a single child path from the pointed-to node. In this case, execution proceeds, via YES path 1651 emanating from decision block 1650, to block 1658. This latter block recursively invokes the present routine, i.e., Node Insert routine 1600, on a child of the pointed-to node, that child being selected by a bit of the new pattern indexed by the pivot bit of the pointed-to node.
Once recursion fully completes and block 1658 has completely executed, execution then exits from routine 1600. Alternatively, if a wildcard exists, as given by a zero in the MASK field at this bit position in the new pattern, then nodes will need to be inserted on both sides (i.e., on both child branches) of the pointed-to node, with appropriate replication of supporting rhizome structure to support each such new node. Consequently, execution proceeds, via NO path 1653, emanating from decision block 1650, to block 1655. This latter block recursively invokes Node Insert routine 1600 for child one of the pointed-to node. Thereafter, once block 1655 has fully executed, execution next proceeds to block 1658 to appropriately insert nodes, again recursively, on now the child zero of the pointed-to node (remembering, as noted above, that a zero is set in a bit position in the VALUE field for a pattern node that has a wildcard at that position), after which execution exits from routine 1600.
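
The choice made at decision block 1650 can be sketched as follows. This is a hedged illustration only: it assumes integer VALUE and MASK fields with bit index zero as the leftmost bit of an 8-bit key, and the names `WIDTH`, `bit` and `children_to_visit` are hypothetical.

```python
WIDTH = 8  # assumed key width in bits (illustrative only)


def bit(word, index):
    """Return bit `index` of `word`; index 0 is taken to be the leftmost bit (an assumption)."""
    return (word >> (WIDTH - 1 - index)) & 1


def children_to_visit(new_value, new_mask, pivot_bit):
    """Return the child numbers on which insertion recurses, in order.

    A zero MASK bit at the pivot marks a wildcard, so both children are
    visited: child one first (block 1655), then child zero (block 1658),
    since VALUE holds a zero at a wildcard position. Otherwise only the
    child selected by the VALUE bit at the pivot is visited."""
    if bit(new_mask, pivot_bit) == 0:
        return [1, 0]
    return [bit(new_value, pivot_bit)]


# New pattern 1010****: VALUE 10100000, MASK 11110000.
at_bit_2 = children_to_visit(0b10100000, 0b11110000, 2)
at_bit_5 = children_to_visit(0b10100000, 0b11110000, 5)
```

Here a branch node pivoting on bit 2 sends the insertion down child one only, while a branch node pivoting on bit 5, where the new pattern has a wildcard, sends it down both children.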

In the non-hierarchical node insertion routine, the blocks within dotted box 1660, specifically decision blocks 1665, 1670, 1676 and 1685, and blocks 1675, 1683, 1690 and 1694 can all be omitted. If decision block 1640 determines that the pointed-to node is a pattern node, then the rhizome contains an extant pattern that is not distinguishable from the new pattern, either because the two patterns are identical or because they bear a hierarchical relationship to each other that is not supported in the non-hierarchical rhizome. Therefore, execution proceeds via YES path 1643 and dashed path 1645 to block 1680, where the new pattern node is removed and the insertion is aborted.

In the hierarchical node insertion routine, execution proceeds from decision block 1640, via YES path 1643, to dotted box 1660 and specifically to decision block 1665 therein. Decision block 1665 determines whether the pointed-to node is the new pattern node. If so, then, by virtue of reaching this point in routine 1600, the new pattern node has already been installed in the rhizome via an alternate path through the search trie, due to the presence of a wildcard in the new pattern. In this case, execution exits from routine 1600, via YES path 1669 and path 1696. If the pointed-to node is not the new pattern node, then the new pattern node will be inserted at an appropriate site in the pattern hierarchy that is to contain the pointed-to node. In this case, execution proceeds, via NO path 1667 emanating from decision block 1665, to decision block 1670. This latter decision block determines whether the new pattern is more specific than the pattern stored in the pointed-to (pattern) node. On the one hand, if the new pattern is so specific, then the new pattern node, storing the new pattern, is inserted before the pointed-to node in the hierarchy chain. Consequently, decision block 1670 routes execution, via YES path 1673, to block 1675. This latter block, when executed, modifies the IMASK field of the new pattern node by conjoining its value with the IMASK field of the pointed-to node and storing the result into the IMASK field of the new pattern node. The new pattern node is then inserted into the hierarchy forest and specifically within the hierarchical pattern chain but before the pointed-to node; hence, the pointed-to node becomes a godparent of the new pattern node. Thereafter, execution exits, via path 1696, from routine 1600. If, on the other hand, the new pattern is not more specific than that stored in the pointed-to (pattern) node, then decision block 1670 routes execution, via NO path 1671, to decision block 1676.
This decision block determines whether the new pattern conflicts with the pattern stored at the pointed-to node, in the manner illustrated by depiction 1130 in FIG. 11C. If such a conflict exists, then admitting the new pattern node would result in a rhizome with no definable hierarchy; thus, insertion of the pattern is refused. Consequently, execution proceeds, via YES path 1679, to block 1680 to remove this particular new pattern node in favor of the pre-existing pattern node. Execution then exits from routine 1600. Now, if decision block 1676 determines that the new pattern and that stored in the pointed-to node do not conflict, then by virtue of the immediately prior test through decision block 1670, the new pattern must be more general than that at the pointed-to node, thereby necessitating that a new pattern node, for this new pattern, be stored in the hierarchy chain but somewhere above the pointed-to (pattern) node. In this case, the IMASK field of the pointed-to node needs to be updated to account for this new pattern, particularly for any wildcard(s) therein. Accordingly, decision block 1676 routes execution, via NO path 1678, to block 1683. This latter block modifies the IMASK field of the pointed-to node as being the conjunction of the existing IMASK field of this node with the IMASK field of the new pattern node. Thereafter, execution proceeds to decision block 1685. This decision block determines whether the pointed-to node has any godparents, i.e., any pattern nodes above it in the hierarchy chain, by testing the godparent pointer of this node (specifically whether it contains, as discussed above, a null value). If no such godparents exist, then execution proceeds, via NO path 1689 from decision block 1685, to block 1694 which, when executed, assigns the new pattern node as a godparent of the pointed-to node. Once this occurs, execution exits, via path 1696, from routine 1600.
If, however, decision block 1685 determines that the pointed-to node has a godparent, then the operations within box 1660 are repeated but at an incrementally next higher level in this hierarchy chain. Specifically, execution proceeds, via YES path 1687, to block 1690 which sets POINTER to point to the godparent of the now prior pointed-to node, the godparent now being the pointed-to node for the next iteration through the operations shown in box 1660. Execution then loops back, via feedback path 1692, to decision block 1665, to consider the next level of the pattern hierarchy chain, and so forth, until the new pattern node, containing this new pattern, has been inserted at a proper hierarchical point within the chain.
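
The loop through box 1660 can be sketched as a walk along a godparent chain. The sketch rests on several assumptions that the text does not spell out: a pattern is taken to be "more specific" than another when it cares about every bit the other cares about, at least one more, and agrees on the shared bits; two patterns reaching the same chain are taken to "conflict" when neither is more specific than the other; and the class `PatternNode` and the functions `more_specific` and `insert_in_chain` are hypothetical names. Undoing IMASK updates after a refused insertion (the work of block 1680) is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class PatternNode:
    value: int
    mask: int    # one = bit is specified, zero = wildcard
    imask: int   # conjunction of MASK over this node and its god-ancestors
    godparent: Optional["PatternNode"] = None


def more_specific(a, b):
    """Assumed test: a specifies every bit b specifies (and more), with equal values there."""
    return a.mask != b.mask and (a.mask & b.mask) == b.mask and (a.value & b.mask) == b.value


def insert_in_chain(head, new):
    """Walk the chain from `head`; return (possibly new head, status string)."""
    prev, node = None, head
    while True:
        if node is new:                      # decision block 1665
            return head, "already present"
        if more_specific(new, node):         # decision block 1670, block 1675
            new.imask &= node.imask
            new.godparent = node
            if prev is None:
                return new, "inserted"
            prev.godparent = new
            return head, "inserted"
        if not more_specific(node, new):     # decision block 1676: conflict
            return head, "conflict"
        node.imask &= new.imask              # block 1683
        if node.godparent is None:           # decision block 1685, block 1694
            node.godparent = new
            return head, "inserted"
        prev, node = node, node.godparent    # block 1690


general = PatternNode(0b10000000, 0b10000000, 0b10000000)             # 1*******
specific = PatternNode(0b10100000, 0b11111111, 0b10000000, general)   # 10100000
new = PatternNode(0b10100000, 0b11110000, 0b11110000)                 # 1010****
head, status = insert_in_chain(specific, new)
```

After this call the chain runs from the fully specified pattern through 1010**** to 1*******, that is, in order of strictly increasing generality.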

iii. Replicate routine 1900

FIG. 19 depicts a flowchart of Replicate routine 1900. This routine is a utility, called during pattern node insertion, for, as noted above, replicating all pattern nodes along with their supporting branch nodes which do not care about a value of a given bit index, i.e., the bit position in the IMASK field has a value of zero.

Routine 1900 is entered with two parameters: POINTER and PIVOT BIT. Together, as noted above, these specify a particular node location within the rhizome at which replication is to start and all pattern nodes (i.e., those patterns that do not care about the value of the bit specified by the value of PIVOT BIT), along with their supporting branch structure, that need to be replicated off that particular node.

We will describe this routine first in a summary overview fashion then in detail.

Essentially, routine 1900 traverses the search trie until it reaches a pattern node. When the routine finds such a node, it recursively traces through a pattern hierarchy chain, should it exist, for that node until a pattern node is found that has a stored pattern that does not care about the given bit index (PIVOT BIT value). As such, by virtue of strict hierarchy in the chain, all godparents of this pattern node do not care about this bit index as well.

In that regard, as the routine first traverses through successive branch nodes in the trie, the routine determines whether each branch of a current node leads to any pattern node having a pattern that does not care about the value of the given bit index. If only one branch leads to such a pattern, then the current node does not need to be replicated; hence, the routine only recurses on that one child node. Alternatively, if both branches emanating from the current node contain such a pattern, then the current node may need to be replicated; hence the routine will recurse on both child nodes therefrom, replicating as necessary. Once the recursions are complete, routine 1900 determines whether the resulting structure on each side of the current node contains a pattern node having a pattern that cares about the pivot bit of the current node. If such a pattern node exists on both sides of the current node, then the current node is itself replicated and the children of this replicated node are set to the replicas returned by the prior recursive calls to the replicate routine. If such a pattern node exists on only one side of the current node, then the current node is not replicated and the replicated branch that does not lead to a pattern node that cares about the pivot bit of the current node is eliminated. If such a pattern node does not exist on either side of the current node, then one branch is arbitrarily chosen; the other branch is eliminated.

Specifically, upon entry into routine 1900, execution first proceeds to decision block 1905. This block, when executed, determines whether the pointed-to node is a pattern node. In the non-hierarchical replicate routine, the blocks within dotted box 1930, specifically decision block 1932 and block 1938, can be omitted. Thus, in this case, if the pointed-to node is a pattern node, then execution exits via YES path 1909 and dashed path 1910, with the address of the current pattern node being returned. In the hierarchical replicate routine, if the pointed-to node is a pattern node, then execution proceeds, via YES path 1909, to decision block 1932. This decision block determines whether the bit, indexed by the pivot bit, in the MASK field of the pointed-to node equals one. If this bit does not equal one, i.e., it equals zero, hence specifying a bit for which a wildcard exists either in this pattern node or higher in the pattern hierarchy and for which this pattern does not care, then execution exits routine 1900, via NO path 1936 emanating from decision block 1932. The address of the pointed-to node is returned. Alternatively, if this bit equals one, i.e., this pattern cares about the value at this bit position, then decision block 1932 routes execution, via YES path 1934, to block 1938. This latter block recursively invokes the present routine, i.e., Replicate routine 1900, but with POINTER set to specify the godparent of the current pattern node. After this recursive execution, execution exits from this routine with an address of a god-ancestor being returned.

Alternatively, if the pointed-to node is not a pattern node, then decision block 1905 routes execution, via NO path 1907, to decision block 1915. This latter block, along with decision block 1940 which follows, successively examines the pivot bit of the IMASK field associated with each child node of the pointed-to node. Specifically, decision block 1915, when executed, tests whether the bit in the IMASK field, indexed by the current pivot bit, of the node on the child zero branch of the pointed-to node equals one. If it does, then all patterns accessible through this node care about the value of this bit. Consequently, no rhizome structure on this side of the pointed-to node needs to be replicated. Hence, decision block 1915 routes execution, via YES path 1919, to block 1920. This latter block recursively invokes routine 1900, but with POINTER set to specify the child one node of the pointed-to node. After this recursive execution, execution exits from this routine with an address of a replicate of the nodal structure on the child one branch being returned. Alternatively, if the bit in the IMASK field, designated by the current pivot bit, of the node on the child zero branch of the pointed-to node equals zero, indicating that some pattern node reachable via the child zero branch does not care about the value of this bit, then decision block 1915 routes execution, via NO path 1917, to decision block 1940 to examine the node on the child one branch of the pointed-to node. Specifically, decision block 1940, when executed, tests whether the bit in the IMASK field, designated by the current pivot bit, of the node on the child one branch of the pointed-to node equals one. If it does, then all patterns accessible through this node care about the value of this bit. Consequently, no nodal structure on this side of the pointed-to node needs to be replicated. Hence, decision block 1940 routes execution, via YES path 1945, to block 1950.
This latter block recursively invokes replicate routine 1900, but with POINTER set to specify the node on the child zero branch of the pointed-to node. After this recursive execution, execution exits from this routine with an address of a replicate of the nodal structure on the child zero branch being returned.

If, however, the bits in the IMASK fields of the nodes on the child zero and one branches from the pointed-to node are both equal to zero, then it may be necessary to replicate the current node. On the other hand, it may not be so necessary, since a replica of the current node is only needed if at least one replicated pattern specifies that the pivot bit of the current node is a zero and at least one replicated pattern specifies that the pivot bit of the current node is a one. The only way to determine this, short of storing a matrix of data at each branch node, is to actually perform the replications of both child branches and then to examine each resulting replica for such patterns, then replicating the current node, if it is required, or eliminating one of the newly replicated sub-structures, if it is not. Thus, decision block 1940 routes execution, via NO path 1943, to block 1960. This latter block, when executed, first recursively invokes replicate routine 1900, with, as arguments thereto, the address of the node on the child zero branch and the current pivot bit. This causes the sub-structure extending from the child zero branch to be examined and replicated as necessary. Once this replication completes, block 1960 recursively invokes routine 1900 again, but with, as arguments thereto, the address of the node on the child one branch and the current pivot bit. This recursive operation causes the sub-structure extending from the child one branch to be examined and replicated as necessary. Once the appropriate portions of the sub-structure have been replicated, execution passes to decision block 1965.

Once the sub-structures of both child branches have been replicated, blocks 1965-1985 determine whether both replicated sub-structures are necessary. If they are, then the current node is replicated; otherwise, the redundant sub-structure is eliminated.

In particular, decision blocks 1965 and 1975 collectively determine whether a bit, indexed by the current pivot bit in the pointed-to node, in the MASK field, of head nodes of both replicated branches of the current node, equals one, i.e., whether patterns reachable through both of these child branches care about the value of the pivot bit of the current node. To accomplish this, decision block 1965 first tests the replicated child zero branch of the current node. If the indexed bit in the MASK field of the head node of the replicated child zero branch equals zero, i.e., indicative of a bit that all patterns reachable through this node do not care about (a wildcard exists at that bit position in all of these patterns), then decision block 1965 routes execution, via NO path 1969, to block 1970. This latter block, when executed, invokes Eliminate routine 2000 (discussed in detail below in conjunction with FIG. 20) to remove this particular replicated nodal structure. Once block 1970 completes execution, execution exits from routine 1900 and the address of the replicate of the child one branch is returned. Alternatively, if the indexed bit in the MASK field of the head node of the replicated child zero branch equals one, then decision block 1965 routes execution, via YES path 1967, to decision block 1975. This latter decision block tests the child one branch of the replicate node. If the indexed bit in the MASK field of the head node of the replicated child one branch equals zero, i.e., indicative of a bit that all patterns reachable through this node do not care about (a wildcard exists at that bit position in all of these patterns), then decision block 1975 routes execution, via NO path 1979, to block 1980. This latter block, when executed, invokes Eliminate routine 2000 to remove this particular replicated nodal structure. Once block 1980 completes execution, execution exits from routine 1900 and the address of the replicate of the child zero branch is returned.
Alternatively, if the indexed bit in the MASK field of the head node of the replicated child one branch equals one, then decision block 1975 routes execution, via YES path 1977, to block 1985. This latter block, when executed, replicates the current node by allocating a new branch node and setting the contents of its fields. Specifically, the VALUE and MASK fields for the new branch node are set to the disjunction of the VALUE and MASK fields, respectively, of the replicate head nodes. The IMASK field for this new node is set to the conjunction of the IMASK fields of the replicate head nodes. The child pointers are set to point to the respective replicas of the child branches of the current node. Lastly, the pivot bit of this new node is set to the current pivot bit of the pointed-to node. After these operations have occurred, execution then exits from routine 1900 with the address of the new node being returned.
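
The decision made by blocks 1965-1985 can be sketched as a small function over the two replica head nodes. This is an illustration under stated assumptions: fields are integers, bit index zero is the leftmost bit of an 8-bit key, the `eliminate` callback merely stands in for Eliminate routine 2000, and the names `Node`, `bit` and `combine_replicas` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

WIDTH = 8  # assumed key width in bits (illustrative only)


def bit(word, index):
    """Return bit `index` of `word`; index 0 is taken to be the leftmost bit (an assumption)."""
    return (word >> (WIDTH - 1 - index)) & 1


@dataclass(eq=False)
class Node:
    value: int
    mask: int
    imask: int
    pivot: int
    child0: Optional["Node"] = None
    child1: Optional["Node"] = None


def combine_replicas(current_pivot, rep0, rep1, eliminate=lambda node: None):
    """Keep one replica, or join both under a replica of the current branch node."""
    if bit(rep0.mask, current_pivot) == 0:    # decision block 1965, block 1970
        eliminate(rep0)
        return rep1
    if bit(rep1.mask, current_pivot) == 0:    # decision block 1975, block 1980
        eliminate(rep1)
        return rep0
    return Node(                              # block 1985
        value=rep0.value | rep1.value,
        mask=rep0.mask | rep1.mask,
        imask=rep0.imask & rep1.imask,
        pivot=current_pivot,
        child0=rep0,
        child1=rep1,
    )


removed = []
rep1 = Node(0b10100000, 0b11110000, 0b11110000, WIDTH)          # 1010****
wild0 = Node(0b10000000, 0b11000000, 0b11000000, WIDTH)         # 10******
kept = combine_replicas(2, wild0, rep1, removed.append)
care0 = Node(0b10000000, 0b11110000, 0b11110000, WIDTH)         # 1000****
joined = combine_replicas(2, care0, rep1, removed.append)
```

In the first call the child zero replica does not care about bit 2, so it is eliminated and the child one replica is returned; in the second call both replicas care about bit 2, so a new branch node pivoting on that bit is allocated.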

3. Removal process

i. Removal routine 1700

FIG. 17 depicts a flowchart of Removal routine 1700 which is also executed by pattern classifier 410 shown in FIG. 4. This routine, combined with routine 1800, is used to delete an existing pattern node from our inventive rhizome. Apart from invoking routine 1800, which will be discussed shortly below in conjunction with FIG. 18, routine 1700 itself performs relatively few operations.

In particular, routine 1700 is entered with one parameter: PATTERN, which specifies the particular pattern node that is to be removed. Upon entry into routine 1700, execution first proceeds to block 1710. This block, when executed, invokes Recursive Node Remove routine 1800 with a single argument, i.e., a pointer to the root node of the rhizome. Routine 1800 is recursively invoked, as and if needed, to remove all now unnecessary structure that solely supports the pattern to be removed. Once routine 1800 has fully completed its execution, including all recursive executions thereof, block 1710 completes its execution. Thereafter, block 1720 executes to deallocate the pattern node; hence, freeing memory locations previously allocated to this node for subsequent reuse. Thereafter, execution simply exits from routine 1700.

ii. Node Remove routine 1800

FIG. 18 depicts a flowchart of Node Remove routine 1800. As noted immediately above, routine 1800 removes, through recursive execution, if needed, all now unnecessary structure that solely supports the pattern to be removed.

Routine 1800 is entered with one parameter: POINTER, which specifies a pattern node which is to be removed from the rhizome. To the extent any underlying structure in the rhizome exists only to support just the pattern node being removed, then that underlying structure will be removed. This routine does not physically remove pattern nodes from memory, i.e., de-allocating memory locations for those nodes (which is the function of Removal routine 1700), but rather it removes pointers in the structure to the indicated pattern node.

We will describe routine 1800 first in a summary overview fashion then in detail.

Routine 1800 performs two completely different sets of operations depending upon whether the current node is a branch node or a pattern node. In the non-hierarchical node remove routine, if the current node is a pattern node, then it must be the pattern node that is being removed. Therefore, the child pointer in the parent branch node that points to the current node is set to null; the routine then terminates. In the hierarchical node remove routine, if the current node is a pattern node, then it may be the pattern node that is being removed or it may be a god-descendent of the pattern node that is being removed. Thus, it is necessary to determine whether the current node is the pattern node to be removed. If it is, then the current node is removed from the chain of godparents, and the routine then terminates; otherwise, the routine recurses on the godparent of the current node, following which the routine updates the IMASK field of the current node to reflect changes in its god-ancestry, and the routine then terminates.

If, however, the current node is a branch node, then routine 1800 determines whether the pattern to be removed cares about the pivot bit of the current node. If it does, then the routine recurses on a child node of the current node determined by the bit of the pattern indexed by the pivot bit of the current node. Otherwise, the routine recurses on both child nodes of the current node. The routine then determines whether the current node is still necessary. Specifically, if the removed pattern does not care about the pivot bit of the current node, then the presence of this pattern in the structure could not have been sustaining the current node; hence, the current node must still be necessary. Otherwise, the routine determines whether any pattern, reachable through the same branch as the removed pattern, still exists that cares about the value of the pivot bit of the current node. If there is such a pattern, then the current node is still necessary. Alternatively, if such a pattern does not exist, then the current node is no longer necessary. Consequently, the current node is then removed from the rhizome.

Specifically, upon entry into routine 1800, execution first proceeds to decision block 1802. This block, when executed, determines, by checking the value of the pivot bit as discussed above, whether the pointed-to node is a pattern node. In the event the current node is a pattern node, execution proceeds, via YES path 1803, to Pattern Node Removal sub-routine 1807. In the non-hierarchical node remove routine, the blocks within dotted box 1850, specifically decision blocks 1855, 1865 and 1875, and blocks 1870, 1880 and 1885, can be omitted. Thus, in this case, if the current node is a pattern node, then execution proceeds, via YES path 1803 and dashed path 1813, to block 1860. Block 1860 removes the pointed-to node, i.e., the sole pattern node in a chain if no pattern hierarchy exists. This removal is effected by setting the child pointer, in the parent branch node that points to the current node, to a null value. Thereafter, execution exits, via path 1862, from routine 1800.

In the hierarchical node remove routine, if the current node is a pattern node, then execution proceeds, via YES path 1803, to decision block 1855. This decision block, when executed, determines whether the pointed-to node stores the particular pattern that is to be removed, i.e., whether this stored pattern identically matches that which is to be removed. If it does match, then decision block 1855 routes execution, via YES path 1859, to block 1860 to remove this particular node. If, however, the pattern to be removed does not exactly match that at the pointed-to node, then decision block 1855 routes execution, via NO path 1857, to decision block 1865. This latter decision block tests whether the pointed-to node has a godparent. If the pointed-to node has a godparent, then decision block 1865 routes execution, via YES path 1869, to block 1870. This latter block recursively invokes this routine, i.e., Node Remove routine 1800, on the godparent of the pointed-to node, i.e., POINTER is set to point to this godparent. Once all such recursive execution has completed, execution of block 1870 completes with execution proceeding to decision block 1875. Execution also reaches this decision block, via NO path 1867 emanating from decision block 1865, in the event the pointed-to node does not have a godparent. Decision block 1875 again tests whether the pointed-to node has a godparent, since this godparent may have just been removed through now completed recursive execution of routine 1800 (invoked through block 1870). In the event the pointed-to node still has a godparent, i.e., the current node previously had more than one godparent, only one of which was removed via recursive execution of routine 1800, then decision block 1875 routes execution, via YES path 1877, to block 1880. This latter block, when executed, sets the IMASK field in the pointed-to node equal to the IMASK field of its godparent.
Thereafter, execution exits from routine 1800.Alternatively, if the pointed-to node now does not have a godparent,i.e., this node is now the highest pattern node in a pattern hierarchychain, then decision block 1875 routes execution, via NO path 1879, toblock 1885. This latter block, when executed, sets the IMASK field inthe pointed-to node equal to the MASK field for this node. Thereafter,execution exits from routine 1800.
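The hierarchical pattern-node path just described can be illustrated with a minimal Python sketch. It is not the routine of the flowchart itself: the class name `PatternNode`, the function name `remove_from_chain`, and the convention of returning the node that should occupy the current position are all hypothetical. In particular, splicing a removed node's godparent into its place is an assumption of this sketch; the text above states only that block 1860 removes the node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatternNode:
    value: int                      # pattern bits
    mask: int                       # 1 = bit is significant, 0 = wildcard
    imask: int                      # summary mask kept up to date on removal
    godparent: Optional["PatternNode"] = None  # next more general pattern


def remove_from_chain(node, value, mask):
    """Remove the pattern (value, mask) from the chain starting at `node`.

    Returns the node that should now occupy this position in the chain.
    """
    if node is None:
        return None
    if node.value == value and node.mask == mask:
        # Block 1860: unlink this node. Splicing in the godparent is an
        # assumption of this sketch, not something the text specifies.
        return node.godparent
    if node.godparent is not None:
        # Block 1870: recurse on the godparent.
        node.godparent = remove_from_chain(node.godparent, value, mask)
    # Blocks 1875, 1880 and 1885: refresh IMASK from what remains above.
    if node.godparent is not None:
        node.imask = node.godparent.imask
    else:
        node.imask = node.mask
    return node
```

For a three-node chain in which the most general pattern is removed, the middle node becomes the top of the chain and its IMASK falls back to its own MASK, and the node below it then inherits that IMASK.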

If, on the other hand, decision block 1802 determines that the pointed-to node is a branch node, then decision block 1802 routes execution, via NO path 1804, to Branch Node Removal sub-routine 1810. Upon entry into sub-routine 1810, execution first proceeds to decision block 1815. This decision block determines whether the pattern to be removed has a one in the bit position of its MASK field indexed by the pivot bit of the pointed-to node, i.e., whether or not a wildcard exists in the pattern at this bit position (a wildcard being indicated by a zero in the corresponding MASK bit).

If the indexed MASK bit in the pattern to be removed is zero, then the current branch node does not exist merely to support the pattern to be removed; therefore, the current branch node will remain in the structure. In this case, decision block 1815 routes execution, via NO path 1817, to block 1835. This latter block, when executed, recursively invokes routine 1800, i.e., Node Remove routine 1800, twice in succession: once for one child of the current pointed-to node and once for the other child of that node. Once all such recursive executions have completed, block 1835 terminates its execution with execution passing to block 1840. This latter block updates the VALUE, MASK and IMASK fields of the pointed-to node. Specifically, the VALUE and MASK fields of the pointed-to node are set equal to the disjunction of the VALUE fields and the disjunction of the MASK fields, respectively, of both of its child nodes. The IMASK field of the pointed-to node is set equal to the conjunction of the IMASK fields of both of its child nodes. Once this occurs, execution exits from routine 1800.
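The field update performed by block 1840 amounts to two bitwise ORs and one bitwise AND. The following Python sketch shows this under the assumption that each field is held as an integer bit vector; the names `Fields` and `update_branch_fields` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fields:
    value: int
    mask: int
    imask: int


def update_branch_fields(branch: Fields, child0: Fields, child1: Fields) -> None:
    """Recompute a branch node's summary fields from its two children."""
    branch.value = child0.value | child1.value   # disjunction of VALUE fields
    branch.mask = child0.mask | child1.mask      # disjunction of MASK fields
    branch.imask = child0.imask & child1.imask   # conjunction of IMASK fields
```

With children holding (VALUE, MASK, IMASK) of (0010, 0111, 0011) and (1001, 1101, 1001), the branch node would be left with (1011, 1111, 0001).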

However, if decision block 1815 determines that the pattern to be removed has a one-valued MASK bit at the bit position indexed by the pivot bit of the pointed-to node, i.e., a wildcard does not exist at this bit position in the pattern, then decision block 1815 routes execution, via YES path 1819, to block 1820. Hence, the search for the pattern to be removed can be confined to one child branch of the current node. Consequently, block 1820, when executed, recursively invokes this routine, i.e., Node Remove routine 1800, with, as an argument, the child node selected by a bit of the VALUE field of the pattern to be removed and indexed by the pivot bit in the pointed-to node. Once this recursive execution has fully completed to remove appropriate structure on one side of the pointed-to node, execution proceeds to decision block 1825. Through a two-part test, this decision block determines whether the current pointed-to branch node itself can be removed. In particular, block 1825 first determines whether, given the value of a bit of the VALUE field of the pattern to be removed indexed by the pivot bit, the pointed-to branch node has a corresponding child node. Second, this block determines whether a bit in the MASK field of that particular child node, at a bit position indexed by the pivot bit of the pointed-to node, is a one, i.e., whether any pattern node reachable through this branch node has a pattern that cares about the value of this particular bit position therein. If both tests are satisfied, then the current pointed-to branch node is necessary and cannot be removed from the rhizome. In this case, decision block 1825 routes execution, via YES path 1827, to block 1840 to appropriately update, as described above, the fields associated with the current pointed-to branch node; after which, execution exits from routine 1800. Alternatively, if the current pointed-to branch node does not satisfy both of the conditions tested by decision block 1825, then the current pointed-to branch node is unnecessary and can be removed in its entirety without any adverse effects on the remaining rhizome structure. In this case, decision block 1825 routes execution, via NO path 1829, to block 1830. This latter block, when executed, replaces the current pointed-to branch node by one of its child nodes other than the corresponding child node selected within decision block 1825. Thereafter, the child node selected by decision block 1825 is eliminated. Finally, the current pointed-to branch node is de-allocated. Once block 1830 has fully executed, execution then exits from routine 1800.
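The choice made at decision block 1815, between descending into a single child and descending into both, can be sketched in Python as follows. The helper names `bit` and `children_to_visit`, the key width `KEY_WIDTH`, and the convention that pivot position 0 denotes the most significant bit are all assumptions made for illustration only.

```python
KEY_WIDTH = 8  # hypothetical key width in bits


def bit(word: int, pivot: int, width: int = KEY_WIDTH) -> int:
    """Return the bit of `word` at position `pivot`, counting from the MSB."""
    return (word >> (width - 1 - pivot)) & 1


def children_to_visit(pivot: int, value: int, mask: int) -> list:
    """Which children of a branch node the removal must descend into.

    A one in the MASK bit means the pattern cares about this position, so
    only the child selected by the VALUE bit is visited (block 1820).
    A zero means a wildcard, so both children are visited (block 1835).
    """
    if bit(mask, pivot) == 1:
        return [bit(value, pivot)]
    return [0, 1]
```

For a pattern with VALUE 10100000 and MASK 11110000, a branch node with pivot 0 leads only to child one, a pivot of 1 leads only to child zero, and a pivot of 5, which falls on a wildcard position, leads to both children.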

iii. Eliminate routine 2000

FIG. 20 depicts a flowchart of Eliminate routine 2000. This routine is a utility, called during pattern node removal and replication, usually recursively, for, as noted above, removing entire sub-structure rooted at a given node. This routine eliminates such sub-structure for either of two reasons: (a) that chain has become unnecessary during node removal, i.e., the pattern node which it supports is being removed; or (b) during the course of replicating rhizome structure, a redundant branch chain has been created and, to simplify the rhizome, should be collapsed. Essentially, this routine recursively traverses nodal structure, stopping at pattern nodes. As the routine moves back up a recursion stack, the routine deallocates each branch node it encountered. This routine does not eliminate pattern nodes; that is provided, as discussed above, through Node Remove routine 1800.

In particular, routine 2000 is entered with one parameter: POINTER, which specifies the particular branch node that is to be eliminated. Upon entry into routine 2000, execution first proceeds to decision block 2010. This decision block tests whether POINTER points to a branch node or a pattern node--inasmuch as this routine will only remove the former. If the pointed-to node, specified by the value of POINTER, is a pattern node, then execution simply exits from routine 2000, via YES path 2017 emanating from decision block 2010. Alternatively, if the pointed-to node is a branch node, then decision block 2010 routes execution, via NO path 2013, to block 2020. This block recursively invokes this same routine, i.e., Eliminate routine 2000, but with the value of POINTER set to the child zero node of the pointed-to node. After this recursion completes, execution then proceeds to block 2030 which recursively invokes this same routine, i.e., Eliminate routine 2000, but with the value of POINTER set to the child one node of the pointed-to node. After this recursion completes, execution then proceeds to block 2040, which deallocates all memory locations assigned to this branch node.

This routine effectively performs a postorder traversal of the search trie rooted at a given branch node, deallocating each branch node after all branch nodes deeper in the structure have been deallocated.
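The postorder behavior of Eliminate routine 2000 can be illustrated with a short Python sketch. The class names, the `name` field, and the `freed` list that stands in for memory deallocation are hypothetical; the sketch also distinguishes node kinds by type rather than by pivot bit value, and tolerates a missing child, both of which are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class PatternNode:
    name: str


@dataclass
class BranchNode:
    name: str
    child0: Optional[Union["BranchNode", PatternNode]] = None
    child1: Optional[Union["BranchNode", PatternNode]] = None


def eliminate(node, freed: List[str]) -> None:
    """Postorder traversal that 'frees' branch nodes and skips pattern nodes."""
    if node is None or isinstance(node, PatternNode):
        return                      # decision block 2010: nothing to remove
    eliminate(node.child0, freed)   # block 2020: child zero first
    eliminate(node.child1, freed)   # block 2030: then child one
    freed.append(node.name)         # block 2040: release this branch node
```

Applied to a root branch node whose child zero is a second branch node and whose remaining descendants are pattern nodes, the sketch records the inner branch node before the root, and records no pattern node.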

Asymptotically, average search time in the rhizome is a logarithmic function of the number of elements stored in the rhizome rather than linear in the number of elements, as occurs with conventional databases of patterns containing wildcards. Hence, use of our inventive rhizome, particularly with a large classification database--as frequently occurs in packet classifiers, appears to provide significant time savings over conventional classification techniques known in the art.

By now, it should be clearly apparent to one skilled in the art that if no wildcard-based patterns exist or are to be inserted in our inventive rhizome, then the remaining rhizome structure would be identical to a conventional Patricia tree--thereby providing full backward compatibility therewith. Furthermore, when operating on our inventive rhizome in the absence of any wildcard-based patterns, our inventive retrieval, insertion and removal processes would each advantageously operate with substantially, if not identically, the same degree of computational complexity as would corresponding processes for a conventional Patricia tree. Therefore, in the absence of any wildcard-based patterns, use of our inventive rhizome does not engender any performance penalty over use of a conventional Patricia tree.

Furthermore, though we described our search trie and hierarchy forest as being search and hierarchy structures stored in memory, specifically memory 340 shown in FIG. 3, these structures can just as easily be stored in any computer readable media, such as a floppy or other transportable media, e.g., a magnetic or optical disk, or even in suitable non-volatile memory, such as flash memory.

Although two embodiments which incorporate the teachings of our present invention have been shown and described in considerable detail herein, those skilled in the art can readily devise many other embodiments that still utilize these teachings.

We claim:
 1. A method for use in a packet classifier, the classifier having a data structure with hierarchically-related pattern values stored therein, said pattern values having wildcards therein; wherein the data structure has a search trie and a hierarchy forest, the search trie being formed of a plurality of branch nodes organized into a binary search trie extending from a root node therein to said hierarchy forest, the hierarchy forest having a plurality of pattern nodes, each having a different corresponding one of the pattern values associated therewith, and at least one hierarchy of pattern nodes with associated pattern values of increasing generality; the method comprising the steps of: forming a key in response to information contained in a portion of the packet; traversing along a path through the search trie in order to reach one of the pattern nodes in said hierarchy forest, the path being defined by the key and pivot bit values associated with ones of said branch nodes; determining whether the key either matches or is subsumed by a pattern value stored in said one pattern node or in a different one of pattern nodes situated at an increasingly higher level of a hierarchy containing said one pattern node, so as to locate a desired one of the pattern values, stored in the hierarchy, that either identically matches or most specifically subsumes the key; and if the key identically matches or is subsumed by the desired one pattern value: accessing a stored classification associated with said desired one pattern value; and returning the stored classification for the packet.
 2. The method in claim 1 wherein the classification is a designation of a transmission queue.
 3. The method in claim 2 wherein the accessing step further comprises the steps of: accessing a reference field associated with a pattern node storing said desired one pattern value so as to yield a stored designation of an associated transmission queue as the classification; and returning the designation of the associated transmission queue for the packet.
 4. The method in claim 3 wherein the portion of the packet is a packet header, and the information comprises source and destination addresses and source and destination port designations.
 5. The method in claim 1 wherein said traversing step comprises the step of tracing along the path by branching from each successive branch node encountered along said path, starting at the root node, in a direction therefrom as specified by a bit of the key indexed by the pivot bit of said each branch node until said one pattern node is encountered.
 6. The method in claim 5 wherein the tracing step comprises the steps of: examining, when the path reaches said successive branch node, a particular bit in the key indexed by a value of the pivot bit associated with said successive branch node; and branching, as specified by a value of the particular bit along either a child zero branch or a child one branch, both branches emanating from the successive branch node, to reach a child one or child zero node as a next node along said path, said next node being either a different one of the branch nodes, having a higher pivot bit associated therewith than that of the successive branch node, or the one pattern node.
 7. The method in claim 6 further comprising the step of: returning a message, indicative of the no match condition, in the event said key did not match nor was subsumed by the pattern value stored in said one pattern node nor in any of the pattern values stored at all increasingly higher levels of the hierarchy.
 8. The method in claim 7 further comprising the step of inserting, in response to a new transmission queue being established, a new pattern node, into said hierarchy forest, for storing a new pattern value in the data structure, said new pattern node having a designation of the new queue associated therewith.
 9. The method in claim 8 wherein the classification is a designation of a transmission queue.
 10. The method in claim 9 wherein the accessing step further comprises the steps of: accessing a reference field associated with a pattern node storing said desired one pattern value so as to yield a stored designation of an associated transmission queue as the classification; and returning the designation of the associated transmission queue for the packet.
 11. The method in claim 10 wherein the portion of the packet is a packet header and the information comprises source and destination addresses.
 12. The method in claim 8 wherein said inserting step further comprises the steps of: inserting said new pattern node at a proper hierarchical location within said hierarchy forest, said inserting comprises updating stored parameter values associated with existing ones of the nodes in the data structure such that said new pattern node is situated properly and is fully reachable therethrough; and if said hierarchy forest contains a wildcard-based pattern, replicating sufficient structure in said search trie, where necessary, based on bit positions where bit disagreements occur among bits in corresponding bit positions in said new pattern and existing pattern values stored in the data structure, such that the new pattern node is reachable over all different paths that, as a result of a wildcard, can be extended through the search trie to the new pattern node.
 13. The method in claim 12 wherein each of said branch nodes in said search trie has a corresponding data structure associated therewith, the replicating step comprises the step of defining a data structure for each new branch node formed in said search trie, wherein the defining step comprises the step of: storing, in corresponding memory locations in said data structure for said new branch node: a corresponding one of the pivot bits associated with said new branch node; child zero and child one pointers to child one and child zero nodes; and values for VALUE, MASK and IMASK parameters.
 14. The method in claim 13 wherein said storing step further comprises the steps of: forming the VALUE parameter as a first predefined function of the VALUE parameters of the child one and zero nodes emanating from said new branch node; forming the MASK parameter as a second predefined function of MASK parameters of the child one and child zero nodes emanating from said new branch node; and forming the IMASK parameter as a third predefined function of IMASK parameters of the child one and child zero nodes emanating from said new branch node.
 15. The method in claim 14 wherein said first and second functions are each a disjunction function, and said third function is a conjunction function.
 16. The method in claim 15 wherein said first and second functions is the disjunction function taken up to the pivot bit position for said one branch node, and the third function is the conjunction function taken up to the pivot bit position for said one branch node.
 17. The method in claim 12 wherein each of said pattern nodes in said hierarchy forest has a corresponding data structure associated therewith, the inserting step comprises the step of defining a data structure for each new pattern node formed in said hierarchy forest, wherein the defining step comprises the step of storing, in corresponding memory locations in said data structure: a reference value; a pointer to a next higher pattern node in a hierarchy containing said new pattern node; and values for VALUE, MASK and IMASK parameters.
 18. The method in claim 17 wherein said storing step further comprises the steps of: setting the VALUE parameter equal to a new pattern value to be stored in said each new pattern node; forming the MASK parameter to specify a wildcard in any bit position in said VALUE parameter for said each new pattern node; and forming the IMASK parameter as a fourth predefined function of the MASK parameter of said each new pattern node and the IMASK parameter of a godparent to said each new pattern node.
 19. The method in claim 18 wherein said fourth function is a conjunction function.
 20. The method in claim 7 further comprising the step of removing an existing pattern node, from said hierarchy forest, so as to remove a pattern value, and a flow, associated therewith from said data structure.
 21. The method in claim 20 wherein the removing step comprises the steps of: eliminating structure from the search trie solely needed to previously support said existing pattern node; and updating stored parameter values associated with remaining ones of the nodes in the data structure such that the existing pattern node is fully removed from the data structure and is not reachable therein.
 22. The method in claim 21 wherein each branch node in the search trie has a corresponding data structure associated therewith, the data structure storing in separate fields therein: a corresponding one of the pivot bits associated with said each branch node; child zero and child one pointers to child one and child zero nodes; and values for VALUE, MASK and IMASK parameters; and wherein said updating step for, a remaining one of the branch nodes in the forest, comprises the steps of: forming the VALUE parameter as a first predefined function of the VALUE parameters of the child one and zero nodes emanating from said remaining one branch node; forming the MASK parameter as a second predefined function of MASK parameters of the child one and child zero nodes emanating from said remaining one branch node; and forming the IMASK parameter as a third predefined function of IMASK parameters of the child one and child zero nodes emanating from said remaining one branch node.
 23. The method in claim 22 wherein said first and second functions are each a disjunction function, and said third function is a conjunction function.
 24. The method in claim 23 wherein said first and second functions is the disjunction function taken up to the pivot bit position for said remaining one branch node, and the third function is the conjunction function taken up to the pivot bit position for said remaining one branch node.
 25. The method in claim 21 wherein each pattern node in the hierarchy forest has a corresponding data structure associated therewith, the data structure storing in separate fields therein: a reference value; a pointer to a next higher pattern node in a hierarchy containing said new pattern node; and values for VALUE, MASK and IMASK parameters, wherein said updating step for, a remaining one of the pattern nodes in the data structure, comprises the step of forming the IMASK parameter as a fourth predefined function of the MASK parameter of said remaining one pattern node and the IMASK parameter of a godparent to said remaining one pattern node.
 26. The method in claim 25 wherein said fourth function is a conjunction function.
 27. A computer readable medium having computer executable instructions stored therein for performing the steps recited in claim 1.
 28. Apparatus for a packet classifier comprising: a processor; a memory having executable instructions and a data structure stored therein, the data structure having a search trie and a hierarchy forest, said search trie being formed of a plurality of branch nodes organized into a binary search trie extending from a root node therein to said hierarchy forest, said hierarchy forest having a plurality of pattern nodes, each of said pattern nodes having a different corresponding one of the pattern values associated therewith, organized into at least one hierarchy of pattern nodes of increasing generality wherein at least one of the pattern values in the hierarchy contains a wildcard; wherein the processor, in response to the stored instructions: forms a key in response to information contained in a portion of the packet; traverses along a path through the search trie in order to reach one of the pattern nodes in said hierarchy forest, the path being defined by the key and pivot bit values associated with ones of said branch nodes; determines whether the key either matches or is subsumed by a pattern value stored in said one pattern node or in a different one of pattern nodes situated at an increasingly higher level of a hierarchy containing said one pattern node, so as to locate a desired one of the pattern values, stored in the hierarchy, that either identically matches or most specifically subsumes the key; and if the key identically matches or is subsumed by the desired one pattern value: accesses a stored classification associated with said desired one pattern value; and returns the stored classification for the packet.
 29. The apparatus in claim 28 wherein the classification is a designation of a transmission queue.
 30. The apparatus in claim 29 wherein the processor, in response to the stored instructions: accesses a reference field associated with a pattern node storing said desired one pattern value so as to yield a stored designation of an associated transmission queue as the classification; and returns the designation of the associated transmission queue for the packet.
 31. The apparatus in claim 30 wherein the portion of the packet is a packet header, and the information comprises source and destination addresses and source and destination port designations.
 32. The apparatus in claim 28 wherein the processor, in response to the stored instructions, traces along the path by branching from each successive branch node encountered along said path, starting at the root node, in a direction therefrom as specified by a bit of the key indexed by the pivot bit of said each branch node until said one pattern node is encountered.
 33. The apparatus in claim 32 wherein the processor, in response to the stored instructions: examines, when the path reaches said successive branch node, a particular bit in the key indexed by a value of the pivot bit associated with said successive branch node; and branches, as specified by a value of the particular bit along either a child zero branch or a child one branch, both branches emanating from the successive branch node, to reach a child one or child zero node as a next node along said path, said next node being either a different one of the branch nodes, having a higher pivot bit associated therewith than that of the successive branch node, or the one pattern node.
 34. The apparatus in claim 33 wherein the processor, in response to the stored instructions, returns a message, indicative of the no match condition, in the event said key did not match nor was subsumed by the pattern value stored in said one pattern node nor in any of the pattern values stored at all increasingly higher levels of the hierarchy.
 35. The apparatus in claim 34 wherein the processor, in response to the stored instructions, inserts, in response to a new transmission queue being established, a new pattern node, into said hierarchy forest, for storing a new pattern value in the data structure, said new pattern node having a designation of the new transmission queue associated therewith.
 36. The apparatus in claim 35 wherein the classification is a designation of a transmission queue.
 37. The apparatus in claim 36 wherein the processor, in response to the stored instructions: accesses a reference field associated with a pattern node storing said desired one pattern value so as to yield a stored designation of an associated transmission queue as the classification; and returns the designation of the associated transmission queue for the packet.
 38. The apparatus in claim 37 wherein the portion of the packet is a packet header, and the information comprises source and destination addresses and source and destination port designations.
 39. The apparatus in claim 35 wherein the processor, in response to the stored instructions: inserts said new pattern node at a proper hierarchical location within said hierarchy forest, said inserting comprises updating stored parameter values associated with existing ones of the nodes in the data structure such that said new pattern node is situated properly and is fully reachable therethrough; and if said hierarchy forest contains a wildcard-based pattern, replicates sufficient structure in said search trie, where necessary, based on bit positions where bit disagreements occur among bits in corresponding bit positions in said new pattern and existing pattern values stored in the data structure, such that the new pattern node is reachable over all different paths that, as a result of a wildcard, can be extended through the search trie to the new pattern node.
 40. The apparatus in claim 39 wherein each of said branch nodes in said search trie has a corresponding data structure associated therewith, and wherein the processor, in response to the stored instructions, defines a data structure for each new branch node formed in said search trie, and through so doing, stores in corresponding memory locations in said data structure for said new branch node: a corresponding one of the pivot bits associated with said new branch node; child zero and child one pointers to child one and child zero nodes; and values for VALUE, MASK and IMASK parameters.
 41. The apparatus in claim 40 wherein the processor, in response to the stored instructions: forms the VALUE parameter as a first predefined function of the VALUE parameters of the child one and zero nodes emanating from said new branch node; forms the MASK parameter as a second predefined function of MASK parameters of the child one and child zero nodes emanating from said new branch node; and forms the IMASK parameter as a third predefined function of IMASK parameters of the child one and child zero nodes emanating from said new branch node.
 42. The apparatus in claim 41 wherein said first and second functions are each a disjunction function, and said third function is a conjunction function.
 43. The apparatus in claim 42 wherein said first and second functions is the disjunction function taken up to the pivot bit position for said one branch node, and the third function is the conjunction function taken up to the pivot bit position for said one branch node.
 44. The apparatus in claim 39 wherein each of said pattern nodes in said hierarchy forest has a corresponding data structure associated therewith, and the processor, in response to the stored instructions, defines a data structure for each new pattern node formed in said hierarchy forest, and, through so doing, stores, in corresponding memory locations in said data structure: a reference value; a pointer to a next higher pattern node in a hierarchy containing said new pattern node; and values for VALUE, MASK and IMASK parameters.
 45. The apparatus in claim 44 wherein the processor, in response to the stored instructions: sets the VALUE parameter equal to a new pattern value to be stored in said each new pattern node; forms the MASK parameter to specify a wildcard in any bit position in said VALUE parameter for said each new pattern node; and forms the IMASK parameter as a fourth predefined function of the MASK parameter of said each new pattern node and the IMASK parameter of a godparent to said each new pattern node.
 46. The apparatus in claim 45 wherein said fourth function is a conjunction function.
 47. The apparatus in claim 34 wherein the processor, in response to the stored instructions, removes an existing pattern node, from said hierarchy forest, so as to remove a pattern value, and a flow, associated therewith from said data structure.
 48. The apparatus in claim 47 wherein the processor, in response to the stored instructions: eliminates structure from the search trie solely needed to previously support said existing pattern node; and updates stored parameter values associated with remaining ones of the nodes in the data structure such that the existing pattern node is fully removed from the data structure and is not reachable therein.
 49. The apparatus in claim 48 wherein each branch node in the search trie has a corresponding data structure associated therewith, the data structure storing in separate fields therein: a corresponding one of the pivot bits associated with said each branch node; child zero and child one pointers to child one and child zero nodes; and values for VALUE, MASK and IMASK parameters; and wherein the processor, in response to the stored instructions, for, a remaining one of the branch nodes in the forest: forms the VALUE parameter as a first predefined function of the VALUE parameters of the child one and zero nodes emanating from said remaining one branch node; forms the MASK parameter as a second predefined function of MASK parameters of the child one and child zero nodes emanating from said remaining one branch node; and forms the IMASK parameter as a third predefined function of IMASK parameters of the child one and child zero nodes emanating from said remaining one branch node.
 50. The apparatus in claim 49 wherein said first and second functions are each a disjunction function, and said third function is a conjunction function.
 51. The apparatus in claim 50 wherein said first and second functions is the disjunction function taken up to the pivot bit position for said remaining one branch node, and the third function is the conjunction function taken up to the pivot bit position for said remaining one branch node.
 52. The apparatus in claim 48 wherein each pattern node in the hierarchy forest has a corresponding data structure associated therewith, the data structure storing in separate fields therein: a reference value; a pointer to a next higher pattern node in a hierarchy containing said new pattern node; and values for VALUE, MASK and IMASK parameters, wherein the processor, in response to the stored instructions and for a remaining one of the pattern nodes in the data structure, forms the IMASK parameter as a fourth predefined function of the MASK parameter of said remaining one pattern node and the IMASK parameter of a godparent to said remaining one pattern node.
 53. The apparatus in claim 52 wherein said fourth function is a conjunction function.